SECRETED HUMAN PROTEINS

This application claims the benefit of copending provisional application 
Serial No. 60/032,757, filed December 11, 1996, which is incorporated herein by
reference. 

TECHNICAL AREA OF THE INVENTION

The invention relates to the area of proteins. More particularly, the 
invention relates to human secreted proteins. 

BACKGROUND OF THE INVENTION

Secreted proteins include such important proteins as growth factors, 
cytokines and their receptors, extracellular matrix proteins, and proteases. 
Nucleotide sequences encoding these proteins can be used to detect disease states in 
which such proteins are implicated and to develop therapeutics for such diseases. 
Thus, there is a need in the art for methods of identifying secreted proteins and the 
nucleotide sequences which encode them.

SUMMARY OF THE INVENTION 

It is an object of the invention to provide an isolated and purified human
protein.

It is yet another object of the invention to provide a fusion protein. 

It is still another object of the invention to provide a preparation of
antibodies. 

It is even another object of the invention to provide an isolated and purified 
subgenomic polynucleotide. 

It is yet another object of the invention to provide an isolated gene. 

It is a further object of the invention to provide a DNA construct for 
expressing all or a portion of a human protein. 

It is still another object of the invention to provide a host cell comprising a 
DNA construct. 

It is another object of the invention to provide a homologously recombinant
cell.

It is even another object of the invention to provide a method of producing a 
human protein. 

It is another object of the invention to provide a method of identifying a 
secreted polypeptide which is modified by rough microsomes. 

These and other objects of the invention are provided by one or more of the 
embodiments described below. 

One embodiment of the invention provides an isolated and purified human 
protein. The isolated and purified human protein has an amino acid sequence 
selected from the group consisting of the amino acid sequences shown in SEQ ID 
Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. 

Another embodiment of the invention provides an isolated and purified 
human protein having an amino acid sequence which is at least 85% identical to an 
amino acid sequence selected from the group consisting of the amino acid 
sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, and 38.

Still another embodiment of the invention provides a polypeptide comprising 
at least 6 contiguous amino acids of an amino acid sequence selected from the 
group consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. 



Even another embodiment of the invention provides a fusion protein. The 
fusion protein comprises a first protein segment and a second protein segment fused 
together by means of a peptide bond. The first protein segment consists of at least 
6 contiguous amino acids selected from the group consisting of the amino acid 
sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, and 38.

Yet another embodiment of the invention provides a preparation of 
antibodies. The antibodies specifically bind to a human protein having an amino 
acid sequence selected from the group consisting of the amino acid sequences 
shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 
36, 37, and 38. 

Even another embodiment of the invention provides an isolated and purified 
subgenomic polynucleotide. The isolated and purified subgenomic polynucleotide 
has a nucleotide sequence selected from the group consisting of the nucleotide 
sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, and 19. 

Yet another embodiment of the invention provides an isolated and purified 
subgenomic polynucleotide consisting of at least 10 contiguous nucleotides selected 
from the group consisting of the nucleotide sequences shown in SEQ ID NOs:1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19. 

Still another embodiment of the invention provides an isolated gene. The 
isolated gene corresponds to a cDNA sequence selected from the group consisting 
of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, and 19. 

Another embodiment of the invention provides a DNA construct for 
expressing all or a portion of a human protein. The DNA construct comprises a 
promoter and a polynucleotide segment. The polynucleotide segment encodes at 
least 6 contiguous amino acids of a human protein having an amino acid sequence 
selected from the group consisting of the amino acid sequences shown in SEQ ID 
Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38.
The polynucleotide segment is located downstream from the promoter.
Transcription of the polynucleotide segment initiates at the promoter. 

Even another embodiment of the invention provides a host cell comprising a 
DNA construct. The DNA construct comprises a promoter and a polynucleotide 
segment. The polynucleotide segment encodes at least 6 contiguous amino acids of 
a human protein having an amino acid sequence selected from the group consisting 
of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 
28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. The polynucleotide segment is 
located downstream from the promoter. Transcription of the polynucleotide 
segment initiates at the promoter. 

Still another embodiment of the invention provides a homologously 
recombinant cell having incorporated therein a new transcription initiation unit. The 
transcription initiation unit comprises in 5' to 3' order an exogenous regulatory 
sequence, an exogenous exon, and a splice donor site. The transcription initiation 
unit is located upstream to a coding sequence of a gene. The gene comprises a 
nucleotide sequence selected from the group consisting of the nucleotide sequences 
shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
and 19. The exogenous regulatory sequence controls transcription of the coding 
sequence of the gene. 

Yet another embodiment of the invention provides a method of producing a 
human protein. A culture of a cell is grown. The cell comprises a DNA construct. 
The DNA construct comprises a promoter and a polynucleotide segment. The 
polynucleotide segment encodes at least 6 contiguous amino acids of a human 
protein having an amino acid sequence selected from the group consisting of the 
amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, and 38. The polynucleotide segment is located 
downstream from the promoter. Transcription of the polynucleotide segment 
initiates at the promoter. The protein is purified from the culture. 

Even another embodiment of the invention provides a method of producing 
a human protein. A culture of a cell is grown. The cell comprises a new 
transcription initiation unit. The transcription initiation unit comprises in 5' to 3'
order an exogenous regulatory sequence, an exogenous exon, and a splice donor
site. The transcription initiation unit is located upstream to a coding sequence of a
gene. The gene comprises a nucleotide sequence selected from the group consisting
of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, and 19. The exogenous regulatory sequence controls 
transcription of the coding sequence of the gene. The protein is purified from the 
culture. 

Another embodiment of the invention provides a method of identifying a 
secreted polypeptide which is modified by rough microsomes. A population of 
cDNA molecules is transcribed in vitro whereby a population of cRNA molecules is 
formed. A first portion of the population of cRNA molecules is translated in vitro 
in the absence of rough microsomes whereby a first population of polypeptides is 
formed. A second portion of the population of cRNA molecules is translated in 
vitro in the presence of rough microsomes whereby a second population of 
polypeptides is formed. The first population of polypeptides is compared with the 
second population of polypeptides. Polypeptide members of the second population 
which have been modified by the rough microsomes are detected. 

The present invention thus provides the art with a method for identifying 
secreted proteins or polypeptides, the amino acid sequences of nineteen novel 
human secreted proteins, and the nucleotide sequences which encode these proteins. 
The invention can be used, inter alia, to produce secreted proteins for
therapeutic and diagnostic purposes. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The inventors have discovered a method for identifying secreted proteins or 
polypeptides. Secreted proteins or polypeptides include soluble proteins which can 
be transported across a membrane, such as a cell membrane, nuclear membrane, or 
membrane of the endoplasmic reticulum, as well as proteins which can be partially 
secreted from a cell, such as membrane-bound receptors. 

Secreted proteins can contain a signal (or secretion leader) sequence, 
located at the N-terminus and including at least several hydrophobic amino acids,
such as phenylalanine, methionine, leucine, valine, or tryptophan. Non-hydrophobic
amino acids can also be included in the signal sequence. Signal sequences are
described in von Heijne, J. Mol. Biol. 184:99-105 (1985) and Kaiser and Botstein,
Mol. Cell. Biol. 5:2382-2391 (1986). Secreted proteins can also be glycosylated by
post-translational modification. The presence of a signal sequence, the presence
of glycosylation, or both indicates that a particular protein is a secreted protein.

In order to identify secreted proteins or polypeptides, the method of the 
invention exploits properties of microsomes, which are the closed vesicles that 
result from fragmentation of endoplasmic reticulum. Microsomes can be rough or 
smooth, depending on whether the endoplasmic reticulum from which they were 
derived is studded with ribosomes. Microsomes, particularly rough microsomes, 
have the ability to perform post-translational modifications, such as glycosylation 
and cleavage of signal sequences from proteins or polypeptides. 

To identify secreted proteins, a population of complementary DNA (cDNA)
molecules is transcribed in vitro to synthesize a population of complementary RNA 
(cRNA) molecules. The cDNA molecules can be synthesized by reverse 
transcription of mRNA molecules isolated from a particular cell or tissue type or 
organism using, for example, a commercially available reverse transcriptase enzyme. 
Alternatively, the reverse transcription reaction to form cDNA molecules can be 
conducted on total RNA, without a preliminary purification of mRNA. 

Any organism, such as a bacterium, plant, invertebrate, or vertebrate 
organism, can be used as a source of RNA. Particularly preferred sources of RNA
are mammals, most preferably humans. Tissues, such as liver, brain, kidney, spleen,
pancreas, or muscle, can be used as a source of RNA. Individual cell types, either
primary cells or members of established cell lines, such as HeLa, CHO, PC12, P19,
BHK, COS, or HepG2, are suitable sources of RNA. Tissues or primary cells
isolated from organisms at a particular stage in development can be used as RNA 
sources. Stem cells, such as hematopoietic, neuronal, and embryonic stem cells, can 
also be used as a source of RNA. 

Total RNA or mRNA can be isolated using methods known in the art. Such 
methods are described, inter alia, in Sambrook et al., Molecular Cloning, A
Laboratory Manual (2d ed., Cold Spring Harbor Press, N.Y., 1989), and
Ausubel et al., Current Protocols in Molecular Biology (Greene Publishing
Associates and John Wiley & Sons, N.Y., 1994). Techniques for RNA isolation 
can be tailored for a particular organism or cell type, as is known in the art. 

Complementary DNA can optionally be obtained from a cDNA library. The 
cDNA library can be derived from the genome of any organism of interest, 
particularly a mammal or a human. Tissue- or cell type-specific cDNA libraries can 
also be used as a source of cDNA. 

Transcription of cDNA molecules in vitro to form cRNA molecules can be 
carried out using any methods known in the art. These methods include, for 
example, placing cDNA into a cloning vector containing a promoter, such as an 
SP6, T7, or T3 polymerase promoter, and transcribing the cDNA using the 
appropriate polymerase. A variety of commercial kits are available for this purpose. 

A first portion of the population of cRNA molecules can be translated in 
vitro, in the absence of rough microsomes, to form a first population of 
polypeptides which have not been post-translationally modified. A second portion 
of the population of cRNA molecules can be translated in vitro in the presence of 
rough microsomes. Under the conditions of the in vitro translation reaction, rough 
microsomes can cleave signal sequences from those polypeptides which comprise 
such sequences. Under the same conditions, rough microsomes can also glycosylate 
those polypeptides which contain glycosylation sites. 

Methods of in vitro translation are those which are known in the art, such 
as translation in a reticulocyte lysate system, particularly a rabbit reticulocyte lysate. 
Reticulocyte lysate systems can be assembled in the laboratory or purchased 
commercially in kit form.

Microsomes can be prepared by disruption of tissues or cells by 
homogenization, as is known in the art. If desired, rough and smooth microsomes 
can be separated using well-known techniques, such as sucrose density gradient 
sedimentation. Microsomes are also available commercially, for example,
the canine pancreatic microsomes available from Promega Corp., Madison, WI.



The first population of polypeptides can then be compared with the second 
population of polypeptides. This comparison can be by means of, for example, one-
or two-dimensional polyacrylamide gel electrophoresis, as is known in the art.
Polypeptides separated in the gels can be detected by any means known in the art, 
such as staining with copper, silver, Coomassie Brilliant Blue, amido black, fast 
green FCF, Ponceau S, or a chromophoric label. Separated proteins can also be 
visualized using radioactive, chemiluminescent, fluorescent, or enzymatic tags 
incorporated into the proteins before separation. 

The gels can be dried or the proteins can be transferred to membranes, such 
as polyvinylidene difluoride membranes. Either the gels or membranes themselves 
or photographs of the gels or membranes can be compared by eye. Alternatively, 
the gels or membranes can be scanned, for example, with a densitometer and 
analyzed with the aid of a computer. 

Polypeptide members of the second population of polypeptides, which have 
been modified by the rough microsomes, can be detected by any means available in 
the art. For example, a shift in the position of a polypeptide band can be observed, 
indicating an increase in molecular weight of a member of the second population 
compared with the corresponding polypeptide member of the first population. Such 
an increase in molecular weight indicates that the polypeptide member of the second 
population was glycosylated by the rough microsomes. 

A shift in the position of a polypeptide band indicating a decrease in 
molecular weight of a member of the second population compared with the 
corresponding polypeptide member of the first population can also be observed. 
This decrease in molecular weight indicates that the polypeptide member of the 
second population contained a signal sequence which was cleaved by the rough 
microsomes. 
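
By way of illustration only, the band-shift comparison described above can be sketched in code. The sketch assumes that apparent molecular weights (in kDa) have already been estimated for matched bands from the two translation reactions. The names BandPair and classify_shift and the 1.0 kDa tolerance are hypothetical and are not part of the disclosed method. A polypeptide that is both cleaved and glycosylated would show only the net shift, so a classification of this kind is at most a first-pass screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandPair:
    """Matched bands for one clone (hypothetical record layout)."""
    clone_id: str
    mw_without_kda: float  # apparent MW, translated without rough microsomes
    mw_with_kda: float     # apparent MW, translated with rough microsomes


def classify_shift(pair: BandPair, tolerance_kda: float = 1.0) -> str:
    """Classify the shift seen when rough microsomes are present.

    tolerance_kda is an assumed measurement tolerance, not a value
    taken from the text.
    """
    delta = pair.mw_with_kda - pair.mw_without_kda
    if delta > tolerance_kda:
        # Increase in molecular weight: consistent with glycosylation.
        return "glycosylated"
    if delta < -tolerance_kda:
        # Decrease in molecular weight: consistent with signal sequence cleavage.
        return "signal_cleaved"
    return "unmodified"


if __name__ == "__main__":
    for pair in (
        BandPair("clone_a", 30.0, 33.5),
        BandPair("clone_b", 30.0, 27.8),
        BandPair("clone_c", 30.0, 30.2),
    ):
        print(pair.clone_id, classify_shift(pair))
```

Clones reported as "glycosylated" or "signal_cleaved" would correspond to candidates for secreted polypeptides in the sense used above.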

Polypeptides which are modified by the rough microsomes are identified as 
secreted polypeptides. Optionally, quantities of cDNA molecules which encode 
secreted polypeptides can be obtained. Molecules of cDNA which encode 
polypeptides which are post-translationally modified by the rough microsomes can 
be placed into suitable vectors using standard recombinant DNA techniques and 
used to transform host cells. Many vectors are available for this purpose, such as
retroviral or adenoviral vectors and bacteriophage, as described below. 

Vectors comprising cDNA which encode secreted polypeptides can be 
introduced into host cells using techniques available in the art. These techniques 
include, but are not limited to, transferrin-polycation-mediated DNA transfer, 
transfection with naked or encapsulated nucleic acids, liposome-mediated cellular 
fusion, intracellular transportation of DNA-coated latex beads, protoplast fusion, 
viral infection, electroporation, and calcium phosphate-mediated transfection. 

The host cells can be any host cells which are capable of propagating cDNA 
molecules. A variety of host cells, for example immortalized cell lines such as 
HeLa, CHO, or HEK, are available for this purpose. 

Transformed host cells can be diluted serially and cultured to form individual 
colonies. Methods of culturing host cells and the media suitable for each host cell 
type are well known in the art. Preferably, each colony originates from a single 
transformed host cell. Separate preparations of cDNA from each colony can be 
prepared, as described above, and transcribed in vitro to form cRNA. The cRNA
can be translated to form secreted polypeptides, which can be purified as is known
in the art. If the preparation of secreted polypeptides from a colony contains more 
than one species of polypeptide, the steps described above can be repeated until a 
colony is obtained which contains cDNA encoding only a single species of 
polypeptide. 

Complementary DNA molecules which encode secreted proteins can be 
sequenced using standard nucleotide sequencing techniques. The sequence of each 
cDNA molecule can be compared with known sequences in a database to determine 
whether the clone encodes a known or a novel secreted protein. 

The inventors have used the method of the invention to identify nineteen 
novel human secreted proteins. Amino acid sequences for these nineteen human 
secreted proteins are disclosed in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 
29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. Nucleotide sequences which encode the 
proteins are disclosed in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, and 19, respectively. 



Clones containing the cDNAs of the secreted proteins were deposited on 
December 11, 1997, with the ATCC. Individual bacterial cells (E. coli) in this
composite deposit contain one or more of the polynucleotides encoding the secreted 
proteins of the invention and can be retrieved using an oligonucleotide probe 
designed from the sequence for that particular polynucleotide, as provided herein. 
Each polynucleotide can be removed from the vector by performing an EcoRI/NotI 
digestion (5' site, EcoRI; 3' site, NotI). The deposit submitted to the ATCC has
been designated SECP120997. The nucleotide sequences of these deposits and the 
amino acid sequences they encode are controlling in the event of a discrepancy 
between the amino acid and nucleotide sequences disclosed herein and those 
contained in the deposits. 

A purified and isolated subgenomic polynucleotide of the present invention 
comprises at least 10, 12, 15, 18, 20, 25, 30, 35, 40, 45, or 50 contiguous 
nucleotides selected from the nucleotide sequences shown in SEQ ID NOs:1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19. The isolated and purified 
subgenomic polynucleotides can comprise an entire nucleotide sequence selected 
from the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, and 19. 

Subgenomic polynucleotides contain less than a whole chromosome and are 
preferably intron-free. Polynucleotides of the invention can be isolated and purified 
free from other nucleotide sequences by standard nucleic acid purification 
techniques, using restriction enzymes and probes to isolate fragments comprising 
the coding sequences. 

Isolated genes corresponding to the cDNA sequences disclosed herein are 
also provided. Known methods can be used to isolate the corresponding genes 
using the provided cDNA sequences. These methods include preparation of probes 
or primers from the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 for use in identifying or amplifying 
the genes from human genomic libraries or other sources of human genomic DNA. 

The coding sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, and 19 can be made using reverse transcriptase with 
human mRNA as a template. Amplification by PCR can also be used to obtain the
polynucleotides, using either genomic DNA or cDNA as a template. Polynucleotide 
molecules of the invention can also be made using the techniques of synthetic 
chemistry given the sequences disclosed herein. The degeneracy of the genetic code 
permits alternate nucleotide sequences which will encode the amino acid sequences 
shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 
36, 37, and 38 to be synthesized. All such nucleotide sequences are within the 
scope of the present invention. 
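
By way of illustration only, the number of alternate nucleotide sequences permitted by the degeneracy of the genetic code can be counted for a short peptide. The sketch below assumes the standard genetic code and one-letter amino acid codes. The names CODONS_PER_AMINO_ACID and count_encoding_sequences are hypothetical.

```python
# Number of sense codons per amino acid in the standard genetic code
# (assumption: standard code; the 61 sense codons sum across this table).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return how many distinct coding sequences encode the peptide.

    Each residue contributes independently, so the total is the product
    of the per-residue codon counts. Stop codons are not considered.
    """
    total = 1
    for residue in peptide.upper():
        total *= CODONS_PER_AMINO_ACID[residue]
    return total


if __name__ == "__main__":
    for peptide in ("MW", "ML", "MLS", "MKTAYI"):
        print(peptide, count_encoding_sequences(peptide))
```

The count grows multiplicatively with length, which is why all such sequences are described as a class rather than listed individually.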

Polynucleotide molecules of the invention can be propagated in vectors and 
cell lines as is known in the art. Polynucleotide molecules can be on linear or 
circular molecules. They can be on autonomously replicating molecules or on 
molecules without replication sequences. For propagation, polynucleotides of the 
invention can be introduced into suitable host cells using any techniques available in 
the art, as described above. 

Subgenomic polynucleotides of the invention can be used to propagate 
additional copies of the polynucleotides or to express protein, polypeptides, or 
fusion proteins. The subgenomic polynucleotides disclosed herein can also be used, 
for example, as biomarkers for tissues or chromosomes, as molecular weight 
markers for DNA gels, to elicit immune responses, such as the formation of 
antibodies against single- or double-stranded DNA, and in DNA-ligand interaction 
assays, to detect proteins or other molecules which interact with the nucleotide 
sequences. 

Disease states may be associated with alterations in the expression of genes 
which encode proteins of the invention. Polynucleotide sequences disclosed herein 
can also be used to determine the involvement of any of these sequences in disease
states. For example, a gene in a diseased cell can be sequenced and compared with 
a wild-type coding sequence of the invention. Alternatively, nucleotide probes can 
be constructed and used to detect normal or altered (mutant) forms of mRNA in a 
diseased cell. Subgenomic polynucleotides of the invention can also be used to 
design diagnostic tests and therapeutic compositions for diseases which may be 
associated with altered expression of these genes. 

The present invention provides both full-length and mature forms of the
disclosed proteins. Full-length forms of the proteins have the amino acid sequences 
shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 
36, 37, and 38. The full-length forms of a protein can be processed enzymatically 
to remove a signal sequence, resulting in a mature form of the protein. Signal 
sequences can be identified by examination of the amino acid sequences disclosed 
herein and comparison with amino acid sequences of known signal sequences (see, 
e.g., von Heijne, 1985; Kaiser & Botstein, 1986). Similarly, transmembrane
domains can be identified by examination of the amino acid sequences disclosed 
herein. A transmembrane domain typically contains a long stretch of 15-30 
hydrophobic amino acids. 
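
By way of illustration only, a scan for long hydrophobic stretches of the kind just described can be sketched as follows. This is a deliberately simplified screen, not a validated transmembrane predictor: the hydrophobic residue set (A, V, L, I, F, M, W), the requirement that the stretch be uninterrupted, and the function name hydrophobic_runs are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed hydrophobic residue set (one-letter codes); other definitions exist.
HYDROPHOBIC = frozenset("AVLIFMW")


def hydrophobic_runs(sequence: str, min_len: int = 15) -> List[Tuple[int, int]]:
    """Return (start, end) positions of uninterrupted hydrophobic runs.

    Positions are 1-based and inclusive. Only runs of at least min_len
    residues are reported.
    """
    runs = []
    start = None  # 0-based index where the current run began
    seq = sequence.upper()
    for i, residue in enumerate(seq):
        if residue in HYDROPHOBIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start + 1, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        runs.append((start + 1, len(seq)))
    return runs


if __name__ == "__main__":
    toy = "MKT" + "L" * 20 + "RK"  # toy sequence, not a disclosed sequence
    print(hydrophobic_runs(toy))
```

In practice a windowed hydropathy average tolerates occasional polar residues within a membrane-spanning segment; the strict run used here would miss such cases.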

Other domains with predicted functions can also be identified. For example, 
the protein having the amino acid sequence shown in SEQ ID NO: 23 comprises a 
Kunitz type serine protease inhibitor domain spanning amino acids 68 to 122 of 
SEQ ID NO: 23. The protein having the amino acid sequence shown in SEQ ID 
NO: 20 contains a zinc-finger motif. 

Allelic variants of the disclosed subgenomic polynucleotides can occur and 
encode proteins which are identical, homologous, or substantially related to amino 
acid sequences disclosed herein (see below). 

Allelic variants of subgenomic polynucleotides of the invention can be 
identified by hybridization of putative allelic variants with nucleotide sequences 
disclosed herein under stringent conditions. For example, by using the following 
wash conditions (2 x SSC, 0.1% SDS, room temperature twice, 30 minutes each;
then 2 x SSC, 0.1% SDS, 50 °C once, 30 minutes; then 2 x SSC, room
temperature twice, 10 minutes each), allelic variants can be identified which contain
at most about 25-30% basepair mismatches. More preferably, allelic variants 
contain 15-25% basepair mismatches, even more preferably 5-15% basepair 
mismatches. 
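
By way of illustration only, the percentage of basepair mismatches between a putative allelic variant and a disclosed sequence can be computed once the two sequences have been aligned. The sketch assumes ungapped sequences of equal length. The function names percent_mismatch and mismatch_tier are hypothetical, and the tier labels simply restate the ranges given above. The calculation does not model hybridization itself, which depends on the wash conditions actually used.

```python
def percent_mismatch(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that differ between two aligned sequences.

    Assumes an ungapped alignment of equal, non-zero length.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


def mismatch_tier(pct: float) -> str:
    """Map a mismatch percentage onto the ranges stated in the text."""
    if pct <= 15.0:
        return "even more preferred"
    if pct <= 25.0:
        return "more preferred"
    if pct <= 30.0:
        return "within stated limit"
    return "outside stated range"


if __name__ == "__main__":
    pct = percent_mismatch("ACGTACGTAC", "ACGTACGTAA")
    print(pct, mismatch_tier(pct))
```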

Protein variants of secreted proteins of the invention are also included. 
Amino acids which are not involved in regions which determine biological activity 
can be deleted or modified without affecting biological function. Preferably, protein 
variants of the invention have amino acid sequences which are at least 85%, 90%,
or 95% identical to the amino acid sequences disclosed herein and have similar 
biological properties (see below). More preferably, the molecules are 98% 
identical. Modifications of interest in the protein sequences can include the 
alteration, substitution, replacement, insertion or deletion of a selected amino acid 
residue. Proteins or derivatives can be either glycosylated or unglycosylated. 
Techniques for making such modifications are well known to those skilled in the art 
(see, e.g., U.S. 4,518,584). Alternatively, variants of proteins disclosed herein can 
be constructed using techniques of synthetic chemistry or using recombinant DNA 
methods. 

Preferably, amino acid changes in variants or derivatives of proteins of the 
invention are conservative amino acid changes, i.e., substitutions of similarly 
charged or uncharged amino acids. A conservative amino acid change involves 
substitution of one amino acid for another amino acid of a family of amino acids 
which are structurally related in their side chains. Naturally occurring amino acids 
are generally divided into four families: acidic (aspartate, glutamate), basic (lysine, 
arginine, histidine), non-polar (alanine, valine, leucine, isoleucine, proline, 
phenylalanine, methionine, tryptophan), and uncharged polar (glycine, asparagine, 
glutamine, cysteine, serine, threonine, tyrosine) amino acids. Phenylalanine,
tryptophan, and tyrosine are sometimes classified as aromatic amino acids. It is 
reasonable to expect that an isolated replacement of a leucine with an isoleucine or 
valine, an aspartate with a glutamate, a threonine with a serine, or a similar 
replacement of an amino acid with a structurally related amino acid will not have a 
major effect on the binding properties of the resulting molecule, especially if the 
replacement does not involve an amino acid at a binding site involved in an 
interaction of the protein. Non-naturally occurring amino acids can also be used to 
form protein variants of the invention. 
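
By way of illustration only, the four families of naturally occurring amino acids described above can be encoded as a lookup table, and a substitution can then be tested for membership in a single family. The sketch uses one-letter codes. The names AMINO_ACID_FAMILIES, family_of, and is_conservative are hypothetical. Whether a given substitution actually preserves function remains an experimental question.

```python
# The four families as described in the text, in one-letter codes.
AMINO_ACID_FAMILIES = {
    "acidic": frozenset("DE"),               # aspartate, glutamate
    "basic": frozenset("KRH"),               # lysine, arginine, histidine
    "non-polar": frozenset("AVLIPFMW"),      # alanine ... tryptophan
    "uncharged polar": frozenset("GNQCSTY"), # glycine ... tyrosine
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError("not one of the twenty standard amino acids: %r" % residue)


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    for pair in (("L", "I"), ("D", "E"), ("T", "S"), ("D", "K")):
        print(pair, is_conservative(*pair))
```

The first three pairs printed correspond to the leucine/isoleucine, aspartate/glutamate, and threonine/serine replacements mentioned above.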

Whether an amino acid change results in a functional protein or polypeptide 
can readily be determined by assaying biological properties of the disclosed proteins 
or polypeptides, as described below. Species homologs of human subgenomic 
polynucleotides and proteins of the invention can also be identified by making
suitable probes or primers and screening cDNA expression libraries from other
species, such as mice, monkeys, yeast, or bacteria. 

In the case of proteins which are membrane-bound, such as cell surface 
receptor proteins, soluble forms of the proteins can be obtained by deleting the 
nucleotide sequences which encode part or all of the intracellular and 
transmembrane domains of the protein and expressing a fully secreted form of the 
protein in a host cell. Techniques for identifying intracellular and transmembrane 
domains, such as homology searches, can be used to identify such domains in 
proteins of the invention using amino acid and nucleotide sequences disclosed 
herein. 

Polypeptides consisting of less than full-length proteins of the present 
invention are also provided. Polypeptides of the invention can be linear or can be 
cyclized, for example, as described in Saragovi et al., 1992, Bio/Technology 10,
773-778 and McDowell et al., 1992, J. Amer. Chem. Soc. 114, 9245-9253.
Polypeptides can be used, for example, as immunogens, diagnostic aids, or 
therapeutics, and to create fusion proteins, as described below. 

Polypeptide molecules consisting of less than the entire amino acid 
sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, and 38 are also provided. Such polypeptides comprise at least 6, 
8, 10, 12, 15, 18, or 20 contiguous amino acids of an amino acid sequence shown in 
SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 
and 38. Polypeptide molecules of the invention can also possess minor amino acid 
alterations which do not substantially affect the ability of the polypeptides to 
interact with specific molecules, such as antibodies. 
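
By way of illustration only, the notion of a polypeptide comprising at least 6 contiguous amino acids of a reference sequence can be expressed as a sliding window over that sequence. The function names contiguous_fragments and shares_fragment are hypothetical, and the sequences in the example are toy strings rather than any of the disclosed sequences.

```python
from typing import List


def contiguous_fragments(sequence: str, length: int = 6) -> List[str]:
    """Return every window of the given length, in order of position."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def shares_fragment(candidate: str, reference: str, min_len: int = 6) -> bool:
    """True when candidate contains min_len contiguous residues of reference."""
    return any(frag in candidate
               for frag in contiguous_fragments(reference, min_len))


if __name__ == "__main__":
    reference = "MKTAYIAK"  # toy reference sequence
    print(contiguous_fragments(reference, 6))
    print(shares_fragment("GGKTAYIAGG", reference))
    print(shares_fragment("GGKTAYGG", reference))
```

Any longer shared stretch necessarily contains a shared window of the minimum length, so checking windows of exactly min_len residues is sufficient.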

Derivatives of the polypeptides, such as glycosylated forms, aggregative 
conjugates with other molecules, and covalent conjugates with unrelated chemical 
moieties, are also provided. Derivatives also include allelic variants, species 
variants, and muteins. Covalent derivatives are prepared by linkage of 
functionalities to groups which are found in the amino acid chain or at the N- or C- 
terminal residue by means known in the art. Truncations or deletions of regions 
which do not affect biological function are also encompassed. Truncated or deleted 
polypeptides can be prepared synthetically or recombinantly, or by proteolytic
digestion of purified or partially purified secreted proteins of the invention. 

Fusion proteins comprising at least 6, 8, 10, 12, 15, 18, or 20 contiguous 
amino acids of the disclosed proteins can also be constructed. Human fusion 
proteins are useful, inter alia, for generating antibodies against amino acid 
sequences and for use in various assay systems. For example, fusion proteins can 
be used to identify proteins which interact with secreted proteins of the invention 
and influence their function. Physical methods, such as protein affinity 
chromatography, or library-based assays for protein-protein interactions, such as the 
yeast two-hybrid or phage display systems, can be used for this purpose. Such 
methods are well known in the art and can also be used as drug screens. Fusion 
proteins can also be used to target molecules to a specific location in a cell or to 
cause a molecule to be secreted or to be anchored in a cellular membrane. 

Fusion proteins of the invention comprise two protein segments which are 
fused together with a peptide bond. The first protein segment comprises at least 6, 
8, 10, 12, 15, 18, or 20 contiguous amino acids selected from an amino acid 
sequence shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, and 38. The first protein segment can also be a full-length 
protein (comprising a signal sequence) or a mature protein (lacking a signal 
sequence). The second protein segment can be a full-length protein or a protein 
fragment. The second protein or protein fragment can be labeled with a detectable 
marker, such as a radioactive, chemiluminescent, biotinylated, or fluorescent tag, or 
can be an enzyme which will generate a detectable product. Enzymes suitable for 
this purpose, such as β-galactosidase, are well known in the art.
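
The fragment lengths recited above can be made concrete with a short sketch. The Python below, offered only as an illustration, lists every contiguous window of a chosen length within a protein sequence; any such window of at least 6 residues would be a candidate first protein segment. The function name and the twelve-residue example sequence are hypothetical and are not taken from SEQ ID Nos:20-38.

```python
def contiguous_fragments(protein, length):
    """Return every contiguous window of `length` residues, in order."""
    if length < 1:
        raise ValueError("length must be positive")
    # range() is empty when the protein is shorter than the window.
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Hypothetical sequence used only for illustration.
example = "MKTLLVAGSCPE"
for n in (6, 8, 10, 12):
    fragments = contiguous_fragments(example, n)
    print(n, len(fragments), fragments[0])
```

A sequence of N residues contains N - k + 1 contiguous fragments of length k, so the number of candidate segments falls as the minimum length rises.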

Techniques for making fusion proteins, either recombinantly or by 
covalently linking two protein segments, are well known in the art. Fusion proteins 
comprising amino acid sequences of the invention can also be constructed, for 
example, using standard recombinant DNA methods to make a DNA construct 
which comprises contiguous nucleotides selected from SEQ ID NOs:1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 and encoding the desired amino
acids in proper reading frame with nucleotides encoding the second protein
segment.
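
The requirement of a proper reading frame can be illustrated as follows. This Python sketch is not the method of the specification; it simply translates two coding segments with the standard genetic code and joins them, rejecting a first segment that is not a whole number of codons or that contains a stop codon. The first example segment is the seven codons beginning at the first ATG of SEQ ID NO:1 as printed in the sequence listing; treating that ATG as the start codon is an assumption. The second segment is a hypothetical coding sequence for the eight-residue "Flag" epitope (DYKDDDDK). All function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding segment read from its first base; '*' marks a stop."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("segment length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(first_segment, second_segment):
    """Return the protein encoded by the two segments joined in frame."""
    first = translate(first_segment)
    if "*" in first:
        raise ValueError("stop codon in first segment; second would not be read")
    return first + translate(second_segment)


first = "ATGGCGTGGCGGCGGCGCGAA"       # from SEQ ID NO:1 as printed (assumed start)
second = "GACTACAAGGACGACGATGACAAG"   # hypothetical Flag epitope coding sequence
print(fuse_in_frame(first, second))
```

Dropping or adding a single nucleotide at the junction would shift the frame of the second segment, which is why the length check is applied to the first segment.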

Proteins or polypeptides of the invention can be purified free from other 
components with which they are normally associated in a cell, such as 
carbohydrates, lipids, subcellular organelles, or other proteins. An isolated protein 
or polypeptide is at least 90% pure. Preferably, the preparations are 95% or 99% 
pure. The purity of a preparation can be assessed, for example, by examining 
electrophoretograms of protein or polypeptide preparations at several pH values 
and at several polyacrylamide concentrations, as is known in the art. 

Standard biochemical methods can be used to isolate proteins of the 
invention from tissues which express the proteins or to isolate proteins, 
polypeptides, or fusion proteins from recombinant host cells into which a DNA 
construct has been introduced. Methods of protein purification, such as size 
exclusion chromatography, ammonium sulfate fractionation, ion exchange 
chromatography, affinity chromatography, crystallization, electrofocusing, or 
preparative gel electrophoresis, are well known and widely used in the art. 

Alternatively, proteins, fusion proteins, or polypeptides of the invention can 
be produced by recombinant DNA methods or by synthetic chemical methods. 
Synthetic chemistry methods, such as solid phase peptide synthesis, can be used to 
synthesize proteins, fusion proteins, or polypeptides. For production of 
recombinant proteins, fusion proteins, or polypeptides, coding sequences selected 
from the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, and 19 can be expressed in prokaryotic or eukaryotic 
host cells using expression systems known in the art. These expression systems 
include bacterial, yeast, insect, and mammalian cells (see below). 

The resulting expressed protein can then be purified from the culture 
medium or from extracts of the cultured cells using purification procedures known 
in the art. For example, for proteins fully secreted into the culture medium, cell-free 
medium can be diluted with sodium acetate and contacted with a cation exchange 
resin, followed by hydrophobic interaction chromatography. Using this method, the 
desired protein, fusion protein, or polypeptide is typically greater than 95% pure.
Further purification can be undertaken, using, for example, any of the techniques
listed above. Proteins, fusion proteins, or polypeptides can also be tagged with an 
epitope, such as a "Flag" epitope (Kodak), and purified using an antibody which 
specifically binds to that epitope. 

It may be necessary to modify a protein produced in yeast or bacteria, for 
example by phosphorylation or glycosylation of the appropriate sites, in order to 
obtain a functional protein. Such covalent attachments can be made using known 
chemical or enzymatic methods. 

Proteins or polypeptides of the invention can also be expressed in cultured 
cells in a form which will facilitate purification. For example, a secreted protein or 
polypeptide can be expressed as a fusion protein comprising, for example, maltose 
binding protein, glutathione-S-transferase, or thioredoxin, and purified using a 
commercially available kit. Kits for expression and purification of such fusion 
proteins are available from companies such as New England BioLabs, Pharmacia, 
and Invitrogen. 

The coding sequences disclosed herein can also be used to construct 
transgenic animals, such as cows, goats, pigs, or sheep. Female transgenic animals 
can then produce proteins, polypeptides, or fusion proteins of the invention in their 
milk. Methods for constructing such animals are known and widely used in the art. 

Isolated proteins, polypeptides, or fusion proteins of the invention can be 
used to obtain a preparation of antibodies which specifically bind to epitopes 
comprising amino acid sequences of the invention. Antibodies of the invention can 
be used, for example, to detect proteins, polypeptides, or fusion proteins of the 
invention which are secreted into culture medium or to identify tissues or cells 
which express these molecules. The antibodies can be polyclonal or monoclonal or 
can be single chain antibodies. Techniques for raising polyclonal and monoclonal 
antibodies and for constructing single chain antibodies are well known in the art. 

Antibodies of the invention bind specifically to epitopes comprising amino 
acid sequences of the invention, preferably to epitopes not present on other 
proteins. Typically, at least 6, 8, or 10 contiguous amino acids are required to
form an epitope. However, more amino acids can be part of an epitope, for
example, at least 15, 25, or 50, especially to form epitopes which involve
non-contiguous residues. Specific binding antibodies do not detect other proteins on
Western blots of proteins or in immunocytochemical assays. Specific binding
antibodies provide a signal with epitopes comprising amino acid sequences of the
invention which is at least ten-fold higher than the signal provided with
epitopes which do not comprise amino acid sequences of the invention. Antibodies
which bind specifically to secreted proteins of the invention include those that bind 
to mature or full-length proteins, to polypeptides or degradation products, to fusion 
proteins, or to protein variants. In a preferred embodiment of the invention, the 
antibodies immunoprecipitate the desired protein, fusion protein, or polypeptide 
from solution and react with the protein, fusion protein, or polypeptide on Western 
blots of polyacrylamide gels. 

Techniques for purifying antibodies are those which are available in the art. 
In a preferred embodiment, antibodies are affinity purified by passing the antibodies 
over a column to which amino acid sequences of the invention are bound. The 
bound antibody is then eluted, for example using a buffer with a high salt 
concentration. Any such technique may be chosen to purify antibodies of the 
invention. 

The invention also provides DNA constructs for expressing all or a portion
of a protein of the invention in a host cell. The DNA construct comprises a 
promoter which is functional in the particular host cell selected. The skilled artisan 
can readily select an appropriate promoter from the large number of cell type- 
specific promoters known and used in the art. The DNA construct can also contain 
a transcription terminator which is functional in the host cell. 

The expression construct comprises a polynucleotide segment which 
encodes all or a portion of a human protein encoded by SEQ ID NOs:1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 or a variant thereof. The
polynucleotide segment is located downstream from the promoter. Transcription of 
the polynucleotide segment initiates at the promoter. DNA constructs can be linear 
or circular and can contain sequences, if desired, for autonomous replication. 

The host cell comprising the DNA construct can be any suitable prokaryotic 
or eukaryotic cell. Expression systems in bacteria include those described in Chang
et al., Nature (1978) 275: 615; Goeddel et al., Nature (1979) 281: 544; Goeddel et
al., Nucleic Acids Res. (1980) 8: 4057; EP 36,776; U.S. 4,551,433; deBoer et al.,
Proc. Natl. Acad. Sci. USA (1983) 80: 21-25; and Siebenlist et al., Cell (1980) 20:
269.

Expression systems in yeast include those described in Hinnen et al., Proc.
Natl. Acad. Sci. USA (1978) 75: 1929; Ito et al., J. Bacteriol. (1983) 153: 163;
Kurtz et al., Mol. Cell. Biol. (1986) 6: 142; Kunze et al., J. Basic Microbiol.
(1985) 25: 141; Gleeson et al., J. Gen. Microbiol. (1986) 132: 3459; Roggenkamp
et al., Mol. Gen. Genet. (1986) 202: 302; Das et al., J. Bacteriol. (1984) 158:
1165; De Louvencourt et al., J. Bacteriol. (1983) 154: 737; Van den Berg et al.,
Bio/Technology (1990) 8: 135;
Cregg et al., Mol. Cell. Biol. (1985) 5: 3376; U.S. 4,837,148; U.S. 4,929,555;
Beach and Nurse, Nature (1981) 300: 706; Davidow et al., Curr. Genet. (1985) 10:
380; Gaillardin et al., Curr. Genet. (1985) 10: 49; Ballance et al., Biochem.
Biophys. Res. Commun. (1983) 112: 284-289; Tilburn et al., Gene (1983) 26:
205-221; Yelton et al., Proc. Natl. Acad. Sci. USA (1984) 81: 1470-1474; Kelly and
Hynes, EMBO J. (1985) 4: 475-479; EP 244,234; and WO 91/00357.

Expression of heterologous genes in insects can be accomplished as
described in U.S. 4,745,051; Friesen et al. (1986) "The Regulation of Baculovirus
Gene Expression" in: The Molecular Biology of Baculoviruses (W. Doerfler,
ed.); EP 127,839; EP 155,476; Vlak et al., J. Gen. Virol. (1988) 69: 765-776;
Miller et al., Ann. Rev. Microbiol. (1988) 42: 177; Carbonell et al., Gene (1988)
73: 409; Maeda et al., Nature (1985) 315: 592-594; Lebacq-Verheyden et al., Mol.
Cell. Biol. (1988) 8: 3129; Smith et al., Proc. Natl. Acad. Sci. USA (1985) 82:
8404; Miyajima et al., Gene (1987) 58: 273; and Martin et al., DNA (1988) 7: 99.
Numerous baculoviral strains and variants and corresponding permissive insect host
cells from hosts are described in Luckow et al., Bio/Technology (1988) 6: 47-55;
Miller et al., in GENETIC ENGINEERING (Setlow, J.K. et al. eds.), Vol. 8 (Plenum
Publishing, 1986), pp. 277-279; and Maeda et al., Nature (1985) 315: 592-594.

Mammalian expression can be accomplished as described in Dijkema et al.,
EMBO J. (1985) 4: 761; Gorman et al., Proc. Natl. Acad. Sci. USA (1982b) 79:
6777; Boshart et al., Cell (1985) 41: 521; and U.S. 4,399,216. Other features of
mammalian expression can be facilitated as described in Ham and Wallace, Meth.
Enz. (1979) 58: 44; Barnes and Sato, Anal. Biochem. (1980) 102: 255; U.S.
4,767,704; U.S. 4,657,866; U.S. 4,927,762; U.S. 4,560,655; WO 90/103430; WO
87/00195; and U.S. RE 30,985.

DNA constructs of the invention can be introduced into host cells using any 
technique known in the art. These techniques include transferrin-polycation- 
mediated DNA transfer, transfection with naked or encapsulated nucleic acids, 
liposome-mediated cellular fusion, intracellular transportation of DNA-coated latex 
beads, protoplast fusion, viral infection, electroporation, and calcium phosphate- 
mediated transfection. 

Alternatively, expression of an endogenous gene encoding a protein of the 
invention can be manipulated by introducing by homologous recombination a DNA 
construct comprising a transcription unit in frame with the endogenous gene, to 
form a homologously recombinant cell comprising the transcription unit. The 
transcription unit comprises a targeting sequence, a regulatory sequence, an exon, 
and an unpaired splice donor site. The new transcription unit can be used to turn 
the endogenous gene on or off as desired. This method of affecting endogenous 
gene expression is taught in U.S. 5,641,670, which is incorporated herein by 
reference. 

The targeting sequence is a segment of at least 10, 12, 15, 20, or 50 
contiguous nucleotides selected from the nucleotide sequences shown in SEQ ID 
NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19. The
transcription unit is located upstream to a coding sequence of the endogenous 
gene. The exogenous regulatory sequence directs transcription of the coding 
sequence of the endogenous gene. 
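
Whether a candidate targeting sequence satisfies the length and contiguity conditions stated above can be checked mechanically. The Python below is a minimal sketch under stated assumptions: the helper name is hypothetical, only exact matches on the printed strand are considered, and the reference string is the first 60 nucleotides of SEQ ID NO:1 as printed in the sequence listing.

```python
MIN_LENGTHS = (10, 12, 15, 20, 50)  # minimum lengths recited in the text


def is_contiguous_segment(candidate, reference, min_length=10):
    """True if `candidate` has at least `min_length` bases and occurs as an
    uninterrupted run within `reference` (same strand, exact match)."""
    candidate = candidate.upper()
    reference = reference.upper()
    return len(candidate) >= min_length and candidate in reference


# First 60 nucleotides of SEQ ID NO:1 as printed in the sequence listing.
reference = "GAATTCGGCACGAGGCCTCAGTCTTCCAGGGCGGCGGTGGGTGTCCGCTTCTCTCTGCTC"
candidate = "CGAGGCCTCAGTCTT"  # 15 bases, positions 11-25 of the reference

for n in MIN_LENGTHS:
    print(n, is_contiguous_segment(candidate, reference, n))
```

A practical design would also consider the complementary strand and genomic rather than cDNA sequence; neither is modeled here.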

Secreted proteins of the invention have a variety of uses. For example, 
secreted proteins can be used in assays to determine biological activities, such as 
cytokine, cell proliferation, or cellular differentiation activities, tissue growth or
regeneration, activin or inhibin activity, chemotactic or chemokinetic activity,
hemostatic or thrombolytic activity, receptor/ligand activity, tumor inhibition, or 
anti-inflammatory activity. Assays for these activities are known in the art and are 
disclosed, for example, in U.S. 5,654,173, which is incorporated herein by 
reference. 

Proteins of the invention can also be used as biomarkers, to identify tissues 
or cell types which express the proteins, or a stage- or disease-specific alteration in 
protein expression. Proteins of the invention can be used in protein interaction 
assays, to identify ligands or binding proteins. Compounds which affect the 
biological activities of the secreted proteins or their ability to interact with specific 
ligands can be identified using proteins of the invention in screening assays. 
Proteins and antibodies of the invention can also be used to design diagnostic tests 
and therapeutic compositions for diseases which may be associated with altered 
expression of these proteins. Fusion proteins comprising, for example, signal 
sequences or transmembrane domains of the disclosed proteins, can be used to 
target other protein domains to cellular locations in which the domains are not 
normally found, such as bound to a cellular membrane or secreted extracellularly. 

Further objects, features, and advantages of the present invention will 
readily occur to the skilled artisan provided with the disclosure above. 

SYNOPSIS OF THE INVENTION 

1. An isolated and purified human protein having an amino acid
sequence selected from the group consisting of the amino acid sequences shown in
SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
and 38.

2. An isolated and purified human protein having an amino acid 
sequence which is at least 85% identical to an amino acid sequence selected from 
the group consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38.

3. The isolated and purified human protein of item 2 wherein the amino
acid sequence is at least 90% identical. 

4. The isolated and purified human protein of item 2 wherein the amino 
acid sequence is at least 95% identical. 

5. The isolated and purified human protein of item 2 wherein the amino 
acid sequence is at least 98% identical. 

6. An isolated and purified human polypeptide comprising at least 6 
contiguous amino acids of an amino acid sequence selected from the group 
consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. 

7. A fusion protein comprising a first protein segment and a second 
protein segment fused together by means of a peptide bond, wherein the first 
protein segment consists of at least 6 contiguous amino acids selected from the 
group consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38. 

8. A preparation of antibodies which specifically bind to the human 
protein of item 1. 

9. The preparation of antibodies of item 8 wherein the antibodies are 
monoclonal. 

10. The preparation of antibodies of item 8 wherein the antibodies are 
polyclonal. 

11. The preparation of antibodies of item 8 wherein the antibodies are 
single chain antibodies. 

12. An isolated and purified subgenomic polynucleotide having a 
nucleotide sequence selected from the group consisting of the nucleotide sequences 
shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
and 19. 

13. An isolated and purified subgenomic polynucleotide consisting of at 
least 10 contiguous nucleotides of a nucleotide sequence selected from the group
consisting of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19. 

14. An isolated gene corresponding to a cDNA sequence selected from 
the group consisting of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19. 

15. A DNA construct for expressing all or a portion of a human protein 
having an amino acid sequence selected from the group consisting of the amino acid 
sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, and 38, comprising: 

a promoter; and 

a polynucleotide segment encoding at least 6 contiguous amino acids 
of the human protein, wherein the polynucleotide segment is located downstream 
from the promoter, wherein transcription of the polynucleotide segment initiates at 
or 3' to the promoter.

16. A host cell comprising a DNA construct comprising: 
a promoter; and 

a polynucleotide segment encoding at least 6 contiguous amino acids 
of a human protein having an amino acid sequence selected from the group 
consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38, wherein the 
polynucleotide segment is located downstream from the promoter and wherein
transcription of the polynucleotide segment initiates at or 3' to the promoter.

17. A homologously recombinant cell having incorporated therein a new
transcription initiation unit, wherein the new transcription initiation unit comprises
in 5' to 3' order:

(a) an exogenous regulatory sequence; 

(b) an exogenous exon; and 

(c) a splice donor site, 

wherein the transcription initiation unit is located upstream to a coding sequence of 
a gene, wherein the gene comprises a nucleotide sequence selected from the group
consisting of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19, and wherein the exogenous regulatory 
sequence controls transcription of the coding sequence of the gene. 

18. A method of producing a human protein, comprising the steps of: 
growing a culture of a cell comprising a DNA construct comprising 

(1) a promoter and (2) a polynucleotide segment encoding at least 6 contiguous 
amino acids of a human protein having an amino acid sequence selected from the 
group consisting of the amino acid sequences shown in SEQ ID Nos:20, 21, 22, 23, 
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, and 38, wherein the 
polynucleotide segment is located downstream from the promoter and wherein 
transcription of the polynucleotide segment initiates at or 3' to the promoter; and

purifying the protein from the culture. 

19. A method of producing a human protein, comprising the steps of: 
growing a culture of a homologously recombinant cell having 

incorporated therein a new transcription initiation unit, wherein the new 
transcription initiation unit comprises in 5' to 3' order:

(a) an exogenous regulatory sequence; 

(b) an exogenous exon; and 

(c) a splice donor site, 

wherein the transcription initiation unit is located upstream to a coding sequence of 
a gene, wherein the gene comprises a nucleotide sequence selected from the group 
consisting of the nucleotide sequences shown in SEQ ID NOs:1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 and wherein the exogenous regulatory 
sequence controls transcription of the coding sequence of the gene; and 

purifying the protein from the culture. 

20. A method of identifying a secreted polypeptide which is modified by 
rough microsomes, comprising the steps of: 

transcribing in vitro a population of cDNA molecules whereby a 
population of cRNA molecules is formed;

translating a first portion of the population of cRNA molecules in
vitro in the absence of rough microsomes whereby a first population of polypeptides 
is formed; 

translating a second portion of the population of cRNA molecules in 
vitro in the presence of rough microsomes whereby a second population of 
polypeptides is formed; 

comparing the first population of polypeptides with the second 
population of polypeptides; and 

detecting polypeptide members of the second population which have 
been modified by the rough microsomes. 

21. The method of item 20 wherein the population of cDNA molecules
is synthesized by reverse transcription of a population of mRNA molecules. 

22. The method of item 21 wherein the mRNA molecules are isolated 
from a mammal. 

23. The method of item 22 wherein the mRNA molecules are isolated
from a human. 

24. The method of item 20 wherein the population of cDNA molecules 
is obtained from a cDNA library. 

25. The method of item 24 wherein the cDNA library is derived from a 
mammalian genome. 

26. The method of item 25 wherein the cDNA library is derived from a 
human genome.

SEQUENCE LISTING



(1) GENERAL INFORMATION 

(i) APPLICANT: Escobedo, Jaime 

Quianjin, Hu 
Garcia, Pablo 
Williams, Lewis T. 
Kothakota, Srinivas 

(ii) TITLE OF THE INVENTION: Secreted Human Proteins 

(iii) NUMBER OF SEQUENCES: 38 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Banner & Witcoff 

(B) STREET: 1001 G Street, NW 

(C) CITY: Washington 

(D) STATE: DC 

(E) COUNTRY: USA 

(F) ZIP: 20001 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 11-DEC-1997

(C) CLASSIFICATION:



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/032757 

(B) FILING DATE: 11-DEC-1996



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kagan, Sarah A

(B) REGISTRATION NUMBER: 32141

(C) REFERENCE/DOCKET NUMBER: 2441.39505; 1369.002; 1452.001

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 202-508-9100

(B) TELEFAX: 202-508-9299

(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2063 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



GAATTCGGCA CGAGGCCTCA GTCTTCCAGG GCGGCGGTGG GTGTCCGCTT CTCTCTGCTC 60
TTCGACTGCA CCGCACTCGC GCGTGACCCT GACTCCCCCT AGTCAGCTCA GCGGTGCTGC 120
CATGGCGTGG CGGCGGCGCG AAGCCGGCGT CGGGGCTCGC GGCGTGTTGG CTCTGGCGTT 180
GCTCGCCCTG GCCCTGTGCG TGCCCGGGGC CCGGGGCCGG GCTCTCGAGT GGTTCTCGGC 240
CGTGGTAAAC ATCGAGTACG TGGACCCGCA GACCAACCTG ACGGTGTGGA GCGTCTCGGA 300
GAGTGGCCGC TTCGGCGACA GCTCGCCCAA [illegible] CATGGCCTGG TGGGCGTCCC 360
GTGGGCGCCC GGCGGAGACC TCGAGGGCTG [illegible] ACGCGCTTCT TCGTGCCCGA 420
GCCCGGCGGC CGAGGGGCCG CGCCCTGGGT [illegible] GCTCGTGGGG GCTGCACCTT 480
CAAGGACAAG GTGCTGGTGG CGGCGCGGAG [illegible] GCCGTCGTCC TCTACAATGA 540
GGAGCGCTAC GGGAACATCA CCTTGCCCAT GTCTGACGCG GGAACAGGAA ATATAGTGGT 600
CATTATGATT AGCTATCCAA AAGGAAGAGA AATTTTGGAG CTGGTGCAAA AAGGAATTCC 660
AGTAACGATG ACCATAGGGG TTGGCACCCG GCATGTACAG GAGTTCATCA GCGGTCAGTC 720
TGTGGTGTTT GTGGCCATTG CCTTCATCAC CATGATGATT ATCTCGTTAG CCTGGCTAAT 780
ATTTTACTAT ATACAGCGTT TCCTATATAC TGGCTCTCAG ATTGGAAGTC AGAGCCATAG 840
AAAAGAAACT AAGAAAGTTA TTGGCCAGCT TCTACTTCAT ACTGTAAAGC ATGGAGAAAA 900
GGGAATTGAT GTTGATGCTG AAAATTGTGC AGTGTGTATT GAAAATTTCA AAGTAAAGGA 960
TATTATTAGA ATTCTGCCAT GCAAGCATAT TTTTCATAGA ATATGCATTG ACCCATGGCT 1020
TTTGGATCAC CGAACATGTC CAATGTGTAA ACTTGATGTC ATCAAAGCCC TAGGATATTG 1080
GGGAGAGCCT GGGGATGTAC AGGAGATGCC TGCTCCAGAA TCTCCTCCTG GAAGGGATCC 1140
AGCTGCAAAT TTGAGTCTAG CTTTACCAGA TGATGACGGA AGTGATGACA GCAGTCCACC 1200
ATCAGCCTCC CCTGCTGAAT CTGAGCCACA GTGTGATCCC AGCTTTAAAG GAGATGCAGG 1260
AGAAAATACG GCATTGCTAG AAGCCGGCAG GAGTGACTCT CGGCATGGAG GACCCATCTC 1320
CTAGCACACG TGCCCACTGA AGTGGCACCA ACAGAAGTTT GGCTTGAACT AAAGGACATT 1380
TTATTTTTTT TACTTTAGCA CATAATTTGT [illegible] ATAATGTATA TTATTTTACC 1440
TATTAGATTC TGATTTGATA TACAAAGGAC [illegible] TCTTCTTGAA GAGACTTTTC 1500
GATTAGTCCT CATATATTTA TCTACTAAAA [illegible] ACCATGAACA GTGTGTTGCT 1560
TCAGACTATT ACAAAGACAA CTGGGGCAGG [illegible] [illegible] GTGGTGTTTC 1620
TAAATAATTG GCTGCTATGG TTCTGTAAAA [illegible] [illegible] AAGGTTTTTG 1680
GCAAAGCACA TCAATGTTAG ACTAGTTGAA GTGGAATTGT ATAATTCAAT TCGATAATTG 1740
ATCTCATGGG CTTTCCCTGG AGGAAAGGTT TTTTTTGTTG [illegible] AAGAACTTGA 1800
AACTTGTAAA CTGAGATGTC TGTAGCTTTT TTGCCCATCT GTAGTGTATG TGAAGATTTC 1860
AAAACCTGAG AGCACTTTTT CTTTGTTTAG AATTATGAGA AAGGCACTAG ATGACTTTAG 1920
GATTTGCATT TTTCCCTTTA TTGCCTCATT TCTTGTGACG CCTTGTTGGG GAGGGAAATC 1980
TGTTTATTTT TTCCTACAAA TAAAAAGCTA AGATTCTATA TCGCAAAAAA AAAAAAAAAA 2040
AAAAAAAAAA TTCCTGCGGC CGC 2063
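
Each row of a sequence description ends with a running count of nucleotides, which allows a transcription of the listing to be checked for dropped or damaged blocks. The following Python sketch is an illustration only; the function name is hypothetical, and the two sample rows are the first two rows of SEQ ID NO:1 as printed above.

```python
import re


def parse_listing_rows(text):
    """Join the base blocks of sequence-listing rows into one string.

    Each row is expected to hold space-separated blocks of bases followed by
    a running total. A row whose total disagrees with the number of bases
    read so far raises ValueError, which flags missing or unreadable blocks.
    """
    blocks_seen = []
    total = 0
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank line between rows
        blocks = [t for t in tokens if re.fullmatch(r"[ACGTacgt]+", t)]
        counters = [t for t in tokens if t.isdigit()]
        total += sum(len(b) for b in blocks)
        blocks_seen.extend(b.upper() for b in blocks)
        if counters and int(counters[-1]) != total:
            raise ValueError(
                f"row ending at {counters[-1]} has running total {total}"
            )
    return "".join(blocks_seen)


rows = """\
GAATTCGGCA CGAGGCCTCA GTCTTCCAGG GCGGCGGTGG GTGTCCGCTT CTCTCTGCTC 60
TTCGACTGCA CCGCACTCGC GCGTGACCCT GACTCCCCCT AGTCAGCTCA GCGGTGCTGC 120
"""
sequence = parse_listing_rows(rows)
print(len(sequence), sequence[:10])
```

Tokens that are neither bases nor a count are ignored when joining, so a row containing an unreadable block fails the running-total check rather than passing silently.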



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1328 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 





GAATTCGGCA CGAGGTAGGC AAGGGATAAA AAGGCACCTA AGGCCCTTTT GCAATAAGAA 60
GCCAGATGGA TAAAGGAAGT GCTGGTCACC CTGGAGGTGT ACTGGTTTGG GGAAGGTCCC 120
CGGCCCCCAC AGCCCTCTGG GGAGCCTCAC CCTGGCTCTC CCCACTCACC TCAGCCCTCA 180
GGCAGCCCCT CCACAGGGCC CCTCTCCTGC CTGGACAGCT CTGCTGGTCT CCCCGTCCCC 240
TGGAGAAGAA CAAGGCCATG GGTCGGCCCC TGCTGCTGCC CCTGCTGCTC CTGCTGCAGC 300
CGCCAGCATT TCTGCAGCCT GGTGGCTCCA CAGGATCTGG TCCAAGCTAC CTTTATGGGG 360
TCACTCAACC AAAACACCTC TCAGCCTCCA TGGGTGGCTC TGTGGAAATC CCCTTCTCCT 420
TCTATTACCC CTGGGAGTTA GCCATAGTTC CCAACGTGAG AATATCCTGG AGACGGGGCC 480
ACTTCCACGG GCAGTCCTTC TACAGCACAA GGCCGCCTTC CATTCACAAG GATTATGTGA 540
ACCGGCTCTT TCTGAACTGG ACAGAGGGTC AGGAGAGCGG CTTCCTCAGG ATCTCAAACC 600
TGCGGAAGGA GGACCAGTCT GTGTATTTCT GCCGAGTCGA GCTGGACACC CGGAGATCAG 660
GGAGGCAGCA GTTGCAGTCC ATCAAGGGGA CCAAACTCAC CATCACCCAG GCTGTCACAA 720
CCACCACCAC CTGGAGGCCC AGCAGCACAA CCACCATAGC CGGCCTCAGG GTCACAGAAA 780
GCAAAGGGCA CTCAGAATCA TGGCACCTAA GTCTGGACAC TGCCATCAGG GTTGCATTGG 840
CTGTCGCTGT GCTCAAAACT GTCATTTTGG GACTGCTGTG CCTCCTCCTC CTGTGGTGGA 900
GGAGAAGGAA AGGTAGCAGG GCGCCAAGCA GTGACTTCTG ACCAACAGAG TGTGGGGAGA 960
AGGGATGTGT ATTAGCCCCG GAGGACGTGA TGTGAGACCC GCTTGTGAGT CCTCCACACT 1020
CGTTCCCCAT TGGCAAGATA CATGGAGAGC ACCCTGAGGA CCTTTAAAAG GCAAAGCCGC 1080
AAGGCAGAAG GAGGCTGGGT CCCTGAATCA CCGACTGGAG GAGAGTTACC TACAAGAGCC 1140
TTCATCCAGG AGCATCCACA CTGCAATGAT ATAGGAATGA GGTCTGAACT CCACTGAATT 1200
AAACCACTGG CATTTGGGGG CTGTTTATTA TAGCAGTGCA AAGAGTTCCT TTATCCTCCC 1260
CAAGGATGGA AAAATACAAT TTATTTTGCT TACCATAAAA AAAAAAAAAA AAAAATTCCT 1320
GCGGCCGC 1328
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1689 base pairs 

(B) TYPE: nucleic acid 


(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GAATTCGGCA CGAGGGCAAG ATTCGATACA AAACCAATGA ACCTGTGTGG GAGGAAAACT 60
TCACTTTCTT CATTCACAAT CCCAAGCGCC AGGACCTTGA AGTTGAGGTC AGAGACGAGC 120
AGCACCAGTG TTCCCTGGGG AACCTGAAGG TCCCCCTCAG CCAGCTGCTC ACCAGTGAGG 180
ACATGACTGT GAGCCAGCGC TTCCAGCTCA GTAACTCGGG TCCAAACAGC ACCATCAAGA 240
TGAAGATTGC CCTGCGGGTG CTCCATCTCG AAAAGCGAGA AAGGCCTCCA GACCACCAAC 300
ACTCAGCTCA AGTCAAACGT CCCTCTGTGT CCAAAGAGGG GAGGAAAACA TCCATCAAAT 360
CTCATATGTC TGGGTCTCCA GGCCCTGGTG GCAGCAACAC AGCTCCATCC ACACCAGTCA 420
TTGGGGGCAG TGATAAGCCT GGTATGGAAG AAAAGGCCCA GCCCCCTGAG GCCGGCCCTC 480
AGGGGCTGCA CGACCTGGGC AGAAGCTCCT CCAGCCTCCT GGCCTCCCCA GGCCACATCT 540
CAGTCAAGGA GCCGACCCCC AGCATCGCCT CGGACATCTC GCTGCCCATC GCCACCCAGG 600
AGCTGCGGCA AAGGCTGAGG CAGCTGGAAA ACGGGACGAC CCTGGGACAG TCTCCACTGG 660
GGCAGATCCA GCTGACCATC CGGCACAGCT CGCAGAGAAA CAAGCTTATC GTGGTCGTGC 720
ATGCCTGCAG AAACCTCATT GCCTTCTCTG [illegible] TGACCCCTAT GTCCGCATGT 780
ATTTATTACC AGACAAGAGG CGGTCAGGAA [illegible] ACACGTGTCA AAGAAAACAT 840
TAAATCCAGT GTTTGATCAA AGCTTTGATT [illegible] GTTACCAGAA GTGCAGAGGA 900
GAACGCTCGA CGTTGCCGTG AAGAACAGTG [illegible] GTCCAAAGAC AAAGGGCTCC 960
TTGGCAAAGT ATTGGTTGCT CTGGCATCTG AAGAACTTGC CAAAGGCTGG ACCCAGTGGT 1020
ATGACCTCAC GGAAGATGGG ACGAGGCCTC AGGCGATGAC ATAGCCGCAG CAGGCAGGAG 1080
GCGTCCTCTT CAGCGTAGCT CTCCACCTCT ACCCGGAACA CACCCTCTCA CAGACGTACC 1140
AATGTTATTT TTATAATTTC ATGGATTTAG TTATACATAC CTTAATAGTT TTATAAAATT 1200
GTTGACATTT CAGGCAAATT TGGCCAATAT TATCATTGAA TTTTCTGTGT TGGATTTCCT 1260
CTAGGATTTC GCCAGTTCCT ACAACGTGCA GTAGGGCGGC GGTAGCTCTT GTGTCTGTGG 1320
ACTCTGCTCA GCTGTGTCCG TAGGAGTCGG ATGTGTCTGT GCTTTATTAT GGCCTTGTTT 1380
ATATATCACT GAGGTATACT ATGCCATGTA AATAGACTAT TTTTTATAAT CTTAACATGC 1440
TGGTTTAAAT TCAGAAGGAA ATAGATCAAG GAAATATATA TATTTTCTTC TAAAACTTAT 1500
TAAATTCGTG TGACAAATAA TCATTTTCAT CTTGGCAGCA AAAAGTTCTC AGTGACCTAT 1560
TTTGTGGTGT TTCTTTTTGA AAAGAAAAGC TGAAATATTA TTAAATGCTA GTATGTTTCT 1620
GCCCATTATG AAAGATGAAA TAAAGTATTC AAAATATTAA AAAAAAAAAA AAAAAATTCC 1680
TGCGGCCGC 1689

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1505 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GAATTCGGCA CGAGGAGCAG ATCTGCAAGA GTTTCGTTTA TGGAGGCTGC TTGGGCAACA 60
AGAACAACTA CCTTCGGGAA GAAGAGTGCA TTCTAGCCTG TCGGGGTGTG CAAGGTGGGC 120
CTTTGAGAGG CAGCTCTGGG GCTCAGGCGA CTTTCCCCCA GGGCCCCTCC ATGGAAAGGC 180
GCCATCCAGT GTGCTCTGGC ACCTGTCAGC CCACCCAGTT CCGCTGCAGC AATGGCTGCT 240
GCATCGACAG TTTCCTGGAG TGTGACGACA CCCCCAACTG CCCCGACGCC TCCGACGAGG 300
CTGCCTGTGA AAAATACACG AGTGGCTTTG ACGAGCTCCA GCGCATCCAT TTCCCCAGCG 360
ACAAAGGGCA CTGCGTGGAC CTGCCAGACA CAGGACTCTG CAAGGAGAGC ATCCCGCGCT 420
GGTACTACAA CCCCTTCAGC GAACACTGCG CCCGCTTTAC CTATGGTGGT TGTTACGGCA 480
ACAAGAACAA CTTTGAGGAA GAGCAGCAGT GCCTCGAGTC TTGTCGCGGC ATCTCCAAGA 540
AGGATGTGTT TGGCCTGAGG CGGGAAATCC CCATTCCCAG CACAGGCTCT GTGGAGATGG 600
CTGTCGCAGT GTTCCTGGTC ATCTGCATTG TGGTGGTGGT AGCCATCTTG GGTTACTGCT 660
TCTTCAAGAA CCAGAGAAAG GACTTCCACG GACACCACCA CCACCCACCA CCCACCCCTG 720
CCAGCTCCAC TGTCTCCACT ACCGAGGACA CGGAGCACCT GGTCTATAAC CACACCACGC 780
GGCCCCTCTG AGCCTGGGTC TCACCGGCTC TCACCTGGCC CTGCTTCCTG CTTGCCAAGG 840
CAGAGGCCTG GGCTGGGAAA AACTTTGGAA CCAGACTCTT GCCTGTTTCC CAGGCCCACT 900
GTGCCTCAGA GACCAGGGCT CCAGCCCCTC TTGGAGAAGT CTCAGCTAAG CTCACGTCCT 960
GAGAAAGCTC AAAGGTTTGG AAGGAGCAGA AAACCCTTGG GCCAGAAGTA CCAGACTAGA 1020
TGGACGTGCC TGCATAGGAG TTTGGAGGAA GTTGGAGTTT TGTTTCCTCT GTTCAAAGCT 1080
GCCTGTCCCT ACCCCATGGT GCTAGGAAGA GGAGTGGGGT GGTGTCAGAC CCTGGAGGCC 1140
CCAACCCTGT CCTCCCGAGC TCCTCTTCCA TGCTGTGCGC CCAGGGCTGG GAGGAAGGAC 1200
TTCCCTGTGT AGTTTGTGCT GTAAAGAGTT GCTTTTTGTT TATTTAATGC TGTGGCATGG 1260
GTGAAGAGGA GGGGAAGAGG CCTGTTTGGC CTCTCTATCC TCTCTTCCTC TTCCCCCAAG 1320
ATTGAGCTCT CTGCCCTTGA TCAGCCCCAC CCTGGCCTAG ACCAGCAGAC AGAGCCAGGA 1380
GAAGCTCAGC TGCATTCCGC AGCCCCCACC CCCAAGGTTC TCCAACATCA CAGCCCAGCC 1440
CGCCCACTGG GTAATAAAAG TGGTTTGTGG AAAAAAAAAA AAAAAAAAAA AAGTCCTGCG 1500
GCCGC 1505



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATTCGGCA CGAGGGCCAT GGCCGGGCTA TCCCGCGGGT CCGCGCGCGC ACTGCTCGCC 60
GCCCTGCTGG CGTCGACGCT GTTGGCGCTG CTCGTGTCGC CCGCGCGGGG TCGCGGCGGC 120
CGGGACCACG GGGACTGGGA CGAGGCCTCC CGGCTGCCGC CGCTACCACC CCGCGAGGAC 180
GCGGCGCGCG TGGCCCGCTT CGTGACGCAC GTCTCCGACT GGGGCGCTCT GGCCACCATC 240
TCCACGCTGG AGGCGGTGCG CGGCCGGCCC TTCGCCGACG TCCTCTCGCT CAGCGACGGG 300
CCCCCGGGCG CGGGCAGCGG CGTGCCCTAT TTCTACCTGA GCCCGCTGCA GCTCTCCGTG 360
AGCAACCTGC AGGAGAATCC ATATGCTACA CTGACCATGA CTTTGGCACA GACCAACTTC 420
TGCAAGAAAC ATGGATTTGA TCCACAAAGT CCCCTTTGTG TTCACATAAT GCTGTCAGGA 480
ACTGTGACCA AGGTGAATGA AACAGAAATG GATATTGCAA AGCATTCGTT ATTCATTCGA 540
CACCCTGAGA TGAAAACCTG GCCTTCCAGC CATAATTGGT TCTTTGCTAA GTTGAATATA 600
ACCAATATCT GGGTCCTGGA CTACTTTGGT GGACCAAAAA TCGTGACACC AGAAGAATAT 660
TATAATGTCA CAGTTCAGTG AAGCAGACTG TGGTGAATTT AGCAACACTT ATGAAGTTTC 720
TTAAAGTGGC TCATACACAC TTAAAAGGCT TAATGTTTCT CTGGAAAGCG TCCCAGAATA 780
TTAGCCAGTT TTCTGTCACA TGCTGGTTTG TTTGCTTGCT TGTTTACTTG CTTGTTTACC 840
AATAGAGTTG ACCTGTTATT GGATTTCCTG GAAGATGTGG TAGCTACTTT TTTCCTATTT 900
TGAAGCCATT TTCGTAGAGA AATATCCTTC ACTATAATCA AATAAGTTTT GTCCCATCAA 960
TTCCAAAGAT GTTTCCAGTG GTGCTCTTGA AGAGGAATGA GTACCAGTTT TAAATTGCCC 1020
ATTGGCATTT GAAGGTAGTT GAGTATGTGT TCTTTATTCC TAGAAGCCAC TGTGCTTGGT 1080
AGAGTGCATC ACTCACCACA GCTGCCTCTT GAGCTGCCTG AGCCTGGTGC AAAAGGATTG 1140
GCCCCCATTA TGGTGCTTCT GAATAAATCT TGCCAAGATA GACAAACAAT GATGAAACTC 1200
AGATGGAGCT TCCTACTCAT GTTGATTTAT GTCTCACAAT CCTGGGTATT GTTAATTCAA 1260
CATAGGGTGA AACTATTTCT GATAAAGAAC TTTTGAAAAA CTTTTTATAC TCTAAAGTGA 1320
TACTCAGAAC AAAAGAAAGT CATAAAACTC CTGAATTTAA TTTCCCCACC TAAGTCGAGA 1380
CAGTATTATC AAAACACATG TGCACACAGA TTATTTTTTG GCTCCAAAAC TGGATTGCAA 1440
AAGAAAGAGG AGAGATATTT TGTGTGTTCC TGGTATTCTT TTATAAGTAA AGTTACCCAG 1500
GCATGGACCA GCTTCAGCCA GGGACAAAAT CCCCTCCCAA ACCACTCTCC ACAGCTTTTT 1560
AAAAATACTT CTACTCTTAA CAATTACCTA AGGTTCCTTC AAACCCCCCC AACTCTTAAT 1620
AGCTTCTAGT GCTGCTACAA TCTAAGTCAG GTCACCAGAG GGAAGAGAAC ATGGCATTAA 1680
AAGAATCACA TCTTCAGAAG AGAAGACACT AATATTATTA CCCATATACA TGATTTCAGA 1740
AGATGACATA AGATTCCTCT TAAAGAGGAA ATGTCAGGAA TCAAGCCACT GAATCCTTAA 1800
AGAGAAAAGT TGAATATGAG TCATTGTGTC TGAAAACTGC AAAGTGAACT TAACTGAGAT 1860
CCAGCAAACA GGTTCTGTTT AAGAAAAATA ATTTATACTA AATTTAGTAA AATGGACTTC 1920
TTATTCAAAG CATCAATAAT TAAAAGAATT ATTTTAAAAA AAAAAAAAAA AAAAAAAAAA 1980
AAAAAAAAAT TCCTGCGGCC GC 2002

(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1322 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCGGCA CGAGGGCCAC GACTCTGCTG GCATTTCTTC TATAGCCACT GGAATCTGAT 60
CCTGATTGTC TTCCACTACT ACCAGGCCAT CACCACTCCG CCTGGGTACC CACCCCAGGG 120
CAGGAATGAT ATCGCCACCG TCTCCATCTG TAAGAAGTGC ATTTACCCCA AGCCAGCCCG 180
AACACACCAC TGCAGCATCT GCAACAGGTG TGTGCTGAAG ATGGATCACC ACTGCCCCTG 240
GCTAAACAAT TGTGTGGGCC ACTATAACCA TCGGTACTTC TTCTCTTTCT GCTTTTTCAT 300
GACTCTGGGC TGTGTCTACT GCAGCTATGG AAGTTGGGAC CTTTTCCGGG AGGCTTATGC 360
TGCCATTGAG AAAATGAAAC AGCTCGACAA GAACAAACTA CAGGCGGTTG CCAACCAGAC 420
TTATCACCAG ACCCCACCAC CCACCTTCTC CTTTCGAGAA AGGATGACTC ACAAGAGTCT 480
TGTCTACCTC TGGTTCCTGT GCAGTTCTGT GGCACTTGCC CTGGGTGCCC TAACTGTATG 540
GCATGCTGTT CTCATCAGTC GAGGTGAGAC TAGCATCGAA AGGCACATCA ACAAGAAGGA 600
GAGACGTCGG CTACAGGCCA AGGGCAGAGT ATTTAGGAAT CCTTACAACT ACGGCTGCTT 660
GGACAACTGG AAGGTATTCC TGGGTGTGGA TACAGGAAGG CACTGGCTTA CTCGGGTGCT 720
CTTACCTTCT ACTCACTTGC CCCATGGGAA TGGAATGAGC TGGGAGCCCC CTCCCTGGGT 780
GACTGCTCAC TCAGCCTCTG TGATGGCAGT GTGAGCTGGA CTGTGTCAGC CACGACTCGA 840
GCACTCATTC TGCTCCCTAT GTTATTTCAA GGGCCTCCAA GGGCAGCTTT TCTCAGAATC 900
CTTGATCAAA AAGAGCCAGT GGGCCTGCCT TAGGGTACCA TGCAGGACAA TTCAAGGACC 960
AGCCTTTTTA CCACTGCAGA AGAAAGACAC AATGTGGAGA AATCTTAGGA CTGACATCCC 1020
TTTACTCAGG CAAACAGAAG TTCCAACCCC AGACTAGGGG TCAGGCAGCT AGCTACCTAC 1080
CTTGCCCAGT GCTGACCCGG ACCTCCTCCA GGATACAGCA CTGGAGTTGG CCACCACCTC 1140
TTCTACTTGC TGTCTGAAAA AACACCTGAC TAGTACAGCT GAGATCTTGG CTTCTCAACA 1200
GGGCAAAGAT ACCAGGCCTG CTGCTGAGGT CACTGCCACT TCTCACATGC TGCTTAAGGG 1260
AGCACAAATA AAGGTATTCG ATTTTTAAAA AAAAAAAAAA AAAAAAAAAT TCCTGCGGCC 1320
GC 1322



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAATTCGGCA CGAGGAGCCT GCCTTCATCT AGGATGGCTC CTCTGGGCAT GCTGCTTGGG 60
CTGCTGATGG CCGCCTGCTT CACCTTCTGC CTCAGTCATC AGAACCTGAA GGAGTTTGCC 120
CTGACCAACC CAGAGAAGAG CAGCACCAAA GAAACAGAGA GAAAAGAAAC CAAAGCCGAG 180
GAGGAGCTGG ATGCCGAAGT CCTGGAGGTG TTCCACCCGA CGCATGAGTG GCAGGCCCTT 240
CAGCCAGGGC AGGCTGTCCC TGCAGGATCC CACGTACGGC TGAATCTTCA GACTGGGGAA 300
AGAGAGGCAA AACTCCAATA TGAGGACAAG TTCCGAAATA ATTTGAAAGG CAAAAGGCTG 360
GATATCAACA CCAACACCTA CACATCTCAG GATCTCAAGA GTGCACTGGC AAAATTCAAG 420
GAGGGGGCAG AGATGGAGAG TTCAAAGGAA GACAAGGCAA GGCAGGCTGA GGTAAAGCGG 480
CTCTTCCGCC CCATTGAGGA ACTGAAGAAA GACTTTGATG AGCTGAATGT TGTCATTGAG 540
ACTGACATGC AGATCATGGT ACGGCTGATC AACAAGTTCA ATAGTTCCAG CTCCAGTTTG 600
GAAGAGAAGA TTGCTGCGCT CTTTGATCTT GAATATTATG TCCATCAGAT GGACAATGCG 660
CAGGACCTGC TTTCCTTTGG TGGTCTTCAA GTGGTGATCA ATGGGCTGAA CAGCACAGAG 720
CCCCTCGTGA AGGAGTATGC TGCGTTTGTG CTGGGCGCTG CCTTTTCCAG CAACCCCAAG 780
GTCCAGGTGG AGGCCATCGA AGGGGGAGCC CTGCAGAAGC TGCTGGTCAT CCTGGCCACG 840

GAGCAGCCGC


TCACTGCAAA 


GAAGAAGGTC 


CTGTTTGCAC 


Ivl V»x* X UUw x 


Ox- x Ij^uLvAt 


900


TTCCCCTATG 


CCCAGCGGCA 


GTTCCTGAAG 


X^ X WwWWw X* 


X uUnuw X X 


xvAxvWrACCCTG 


960 


GTGCAGGAGA 


AGGGCACGGA 


GGTGCTCGCC 


GTGCGCGTGG

X» X VVrWU X VJVJ 


x unv»rtv x x 


A AiAj ACC TG 


1020 


GTCACGGAGA 


AGATGTTCGC 


CGAGGAGGAG 


GCTGAGGTGA

x*x^ x Njr»\Jw x vjrv 




x» X LLLtAUAG 


1080 


AAGCTGCAGC 


AGTATCGCCA 


GGTACACCTC 


CTGCCAGGCC 

x* x wvnwwVw 


TGTGGG A AO A 




1140 


GAGATCACGG 


CCCACCTCCT 


GGCGCTGCCC 


GAGCATGATG


CCCGTGAGAA

x*fX*X*X* X VJ**VJxU^. 


vu X Vjx<> X uLAu 




ACACTGGGCG 


TCCTCCTGAC 


CACCTGCCGG 


GACCGCTACC 


w X x^AWxarAxox^x^ 






AGGACACTGG 


v<^/luv^^ x \j\^c\. 


1 bAb 1 AC 


CAGGTGCTGG 


CCAGCCTGGA 


GCTGCAGGAT 


1320 


GGTGAGGACG AGGGCTACTT CCAGGAGCTG CTGGGCTCTG TCAACAGCTT GCTGAAGGAG 1380
CTGAGATGAG GCCCCACACC AGGACTGGAC TGGGATGCCG CTAGTGAGGC TGAGGGGTGC 1440
CAGCGTGGGT GGGCTTCTCA GGCAGGAGGA CATCTTGGCA GTGCTGGCTT GGCCATTAAA 1500
TGGAAACCTG AAGGCCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1560
TTCCTGCGGC CGC 1573



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1185 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



GAATTCGGCA CGAGGGGGCT TTAAGGGACA GCTGAGCCGG CAGGTGGCAG ATCAGATGTG 60
GCAGGCTGGG AAAAGACAAG CCTCCAGGGC CTTCAGCTTG TACGCCAACA TCGACATCCT 120
CAGACCCTAC TTTGATGTGG AGCCTGCTCA GGTGCGAAGC AGGCTCCTGG AGTCCATGAT 180
CCCTATCAAG ATGGTCAACT TCCCCCAGAA AATTGCAGGT GAACTCTATG GACCTCTCAT 240
GCTGGTCTTC ACTCTGGTTG CTATCCTACT CCATGGGATG AAGACGTCTG ACACTATTAT 300
CCGGGAGGGC ACCCTGATGG GCACAGCCAT TGGCACCTGC TTCGGCTACT GGCTGGGAGT 360
CTCATCCTTC ATTTACTTCC TTGCCTACCT GTGCAACGCC CAGATCACCA TGCTGCAGAT 420
GTTGGCACTG CTGGGCTATG GCCTCTTTGG GCATTGCATT GTCCTGTTCA TCACCTATAA 480
TATCCACCTC CACGCCCTCT TCTACCTCTT CTGGCTGTTG GTGGGTGGAC TGTCCACACT 540
GCGCATGGTA GCAGTGTTGG TGTCTCGGAC CGTGGGCCCC ACACAGCGGC TGCTCCTCTG 600
TGGCACCCTG GCTGCCCTAC ACATGCTCTT CCTGCTCTAT CTGCATTTTG CCTACCACAA 660
AGTGGTAGAG GGGATCCTGG ACACACTGGA GGGCCCCAAC ATCCCGCCCA TCCAGAGGGT 720
CCCCAGAGAC ATCCCTGCCA TGCTCCCTGC TGCTCGGCTT CCCACCACCG TCCTCAACGC 780
CACAGCCAAA GCTGTTGCGG TGACCCTGCA GTCACACTGA CCCCACCTGA AATTCTTGGC 840
CAGTCCTCTT TCCCGCAGCT GCAGAGAGGA GGAAGACTAT TAAAGGACAG TCCTGATGAC 900
ATGTTTCGTA GATGGGGTTT GCAGCTGCCA CTGAGCTGTA GCTGCGTAAG TACCTCCTTG 960
ATGCCTGTCG GCACTTCTGA AAGGCACAAG GCCAAGAACT CCTGGCCAGG ACTGCAAGGC 1020
TCTGCAGCCA ATGCAGAAAA TGGGTCAGCT CCTTTGAGAA CCCCTCCCCA CCTACCCCTT 1080
CCTTCCTCTT TATCTCTCCC ACATTGTCTT GCTAAATATA GACTTGGTAA TTAAAATGTT 1140
GATTGAAGTC TGGAAAAAAA AAAAAAAAAA AATTCCTGCG GCCGC 1185



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1226 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



GAATTCGGCA CGAGGCAAGC CACCATCTTC CTTCGGCCTG CACCCCTTTA AAGGCACCCA 60
GACCCCTCTG GAAAAAGATG AACTGAAGCC CTTTGACATC CTCCAGCCTA AGGAGTACTT 120
CCAGCTCAGC CGCCACACGG TCATTAAGAT GGGAAGTGAG AACGAGGCCC TGGATCTCTC 180
CATGAAGTCA GTGCCCTGGC TCAAGGCTGG TGAAGTCAGT CCCCCAATCT TCCAGGAAGA 240
TGCAGCCCTA GACCTGTCAG TGGCAGCCCA CCGGAAATCC GAGCCTCCCC CTGAGACACT 300
GTATGACAGT GGTGCATCAG TGGACAGCTC AGGTCACACA GTGATGGAGA AACTTCCCAG 360
TGGCATGGAA ATTTCTTTTG CCCCTGCCAC GTCCCATGAG GCCCCAGCCA TGATGGATAG 420
TCACATCAGC AGCAGTGATG CTGCTACCGA GATGCTCAGC CAGCCCAACC ACCCCAGCGG 480
CGAAGTCAAG GCTGAAAATA ACATTGAGAT GGTGGGCGAG TCCCAGGCGG CCAAGGTCAT 540
TGTCTCTGTC GAAGATGCTG TGCCTACCAT ATTCTGTGGC AAGATCAAAG GCCTCTCAGG 600
GGTGTCCACC AAAAACTTCT CCTTCAAAAG AGAAGACTCC GTGCTTCAGG GCTATGACAT 660
CAACAGCCAA GGGGAAGAGT CCATGGGAAA TGCAGAGCCC CTTAGGAAAC CCATCAAAAA 720
CCGGAGCATA AAGTTAAAGA AAGTGAACTC CCAGGAAGTA CACATGCTCC CAATCAAAAA 780
ACAACGGCTG GCCACCTTTT TTCCAAGAAA GTAAATAACG GCTTTTTAAA ATTTGTATGA 840
TTATAATATG GGGAAAGGTG CATTGGTTTT ATAAAAAGGC ATTTAAAACA AATTATCTTT 900
GTTAATTATT TTGGGGAGTA GTTGGGAAAT GGAAAGGTGA ATTGGCTCTA GAGGCCCTGT 960
ATGCTAGTAT CATTTTCTTT TTTAATTTTT GACTTTTCAC AAATGAGTAA ATAAGAGCAA 1020
CCTATTTTTC AAGCAGATTG CACATTTTTT GCAGCTTTAA TGGAATATTG GGTGAATTAG 1080
AGGGGTAAAA AAAGCTATTT TCATTGCCAC AAAGTGCTTT GATGATGTAA TACCTAATAA 1140
AGGGTAGGAT GAATATTTCA CAATAAATGT TTGTTTGCAC TAAAAAAAAA AAAAAAAAAA 1200
AAAAAAAAAA AAATTCCTGC GGCCGC 1226



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1049 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



GAATTCGGCA CGAGGGCGCC ATGGTGAAGG TGACGTTCAA CTCCGCTCTG GCCCAGAAGG 60
AGGCCAAGAA GGACGAGCCC AAGAGCGGCG AGGAGGCGCT CATCATCCCC CCCGACGCCG 120
TCGCGGTGGA CTGCAAGGAC CCAGATGATG TGGTACCAGT TGGCCAAAGA AGAGCCTGGT 180
GTTGGTGCAT GTGCTTTGGA CTAGCATTTA TGCTTGCAGG TGTTATTCTA GGAGGAGCAT 240
ACTTGTACAA ATATTTTGCA CTTCAACCAG ATGACGTGTA CTACTGTGGA ATAAAGTACA 300
TCAAAGATGA TGTCATCTTA AATGAGCCCT CTGCAGATGC CCCAGCTGCT CTCTACCAGA 360
CAATTGAAGA AAATATTAAA ATCTTTGAAG AAGAAGAAGT TGAATTTATC AGTGTGCCTG 420
TCCCAGAGTT TGCAGATAGT GATCCTGCCA ACATTGTTCA TGACTTTAAC AAGAAACTTA 480
CAGCCTATTT AGATCTTAAC CTGGATAAGT GCTATGTGAT CCCTCTGAAC ACTTCCATTG 540
TTATGCCACC CAGAAACCTA CTGGAGTTAC TTATTAACAT CAAGGCTGGA ACCTATTTGC 600
CTCAGTCCTA TCTGATTCAT GAGCACATGG TTATTACTGA TCGCATTGAA AACATTGATC 660
ACCTGGGTTT CTTTATTTAT CGACTGTGTC ATGACAAGGA AACTTACAAA CTGCAACGCA 720
GAGAAACTAT TAAAGGTATT CAGAAACGTG AAGCCAGCAA TTGTTTCGCA ATTCGGCATT 780
TTGAAAACAA ATTTGCCGTG GAAACTTTAA TTTGTTCTTG AACAGTCAAG AAAAACATTA 840
TTGAGGAAAA TTAATATCAC AGCATAACCC CACCCTTTAC ATTTTGTTGC AGTTGATTAT 900
TTTTTAAAGT CTTCTTTCAT GTAAGTAGCA AACAGGGCTT TACTATCTTT TCATCTCATT 960
AATTCAATTA AAACCATTAC CTTAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1020
AAAAAAAAAA AAAAAATTCC TGCGGCCGC 1049

(2) INFORMATION FOR SEQ ID NO: 11:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1142 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GAATTCGGCA CGAGGGGAGA ATACTTTTTG CGATGCCTAC TGGAGACTTT GATTCGAAGC 60
CCAGTTGGGC CGACCAGGTG GAGGAGGAGG GGGAGGACGA CAAATGTGTC ACCAGCGAGC 120
TCCTCAAGGG GATCCCTCTG GCCACAGGTG ACACCAGCCC AGAGCCAGAG CTACTGCCGG 180
GAGCTCCACT GCCGCCTCCC AAGGAGGTCA TCAACGGAAA CATAAAGACA GTGACAGAGT 240
ACAAGATAGA TGAGGATGGC AAGAAGTTCA AGATTGTCCG CACCTTCAGG ATTGAGACCC 300
GGAAGGCTTC AAAGGCTGTC GCAAGGAGGA AGAACTGGAA GAAGTTCGGG AACTCAGAGT 360
TTGACCCCCC CGGACCCAAT GTGGCCACCA CCACTGTCAG TGACGATGTC TCTATGACGT 420
TCATCACCAG CAAAGAGGAC CTGAACTGCC AGGAGGAGGA GGACCCTATG AACAAATTCA 480
AGGGCCAGAA GATCGTGTCC TGCCGCATCT GCAAGGGCGA CCACTGGACC ACCCGCTGCC 540
CCTACAAGGA TACGCTGGGG CCCATGCAGA AGGAGCTGGC CGAGCAGCTG GGCCTGTCTA 600
CTGGCGAGAA GGAGAAGCTG CCGGGAGAGC TAGAGCCGGT GCAGGCCACG CAGAACAAGA 660
CAGGGAAGTA TGTGCCGCCG AGCCTGCGCG ACGGGGCCAG CCGCCGCGGG GAGTCCATGC 720
AGCCCAACCG CAGAGCCGAC GACAACGCCA CCATCCGTGT CACCAACTTG CGCAGAGGAC 780
ACGCGTGAGA CCGACCTGCA GGAGCTCTTC CGGCCTTTCG GCTCCATCTC CCGCATCTAC 840
CTGGCTAAGG ACAAGACCAC TGGCCAATCC AAGGGCTTTG CCTTCATCAG CTTCCACCGC 900
CGCGAGGATG CTGCGCGTGC CATTGCCGGG GTGTCCGGCT TTGGCTACGA CCACCTCATC 960
CTCAACGTCG AGTGGGCCAA GCCGTCCACC AACTAAGCCA GCTGCCACTG TGTACTCGGT 1020
CCGGGACCCT TGGCGACAGA AGACAGCCTC CGAGAGCGCG GGCTCCAAGG GCAATAAAGC 1080
AGCTCCACTC TCAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAT TCCTGCGGCC 1140
GC 1142



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1696 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GAATTCGGCA CGAGGGAAAC ATGGCGGTAG
GAAGGAAGAG GAAGGTTTTC CTGAAGATGA
TGGTAAAAGA GTTGGATGCC TTTCCGAAGG
GTGGAGGTAC AGTTTCTCTA ATAGCATTTA
TCTCAGTATA TCAAGATACA TGGATGAAGT
GCAAATTAAG AATTAATATA GATATTACTG
ATGTATTGGA TTTAGCAGAA ACAATGGTTG
CAGTATTTGA TCTTTCACCA CAGCAGAAAG
GTAGGCTACA AGAAGAGCAT TCACTTCAAG
CATCAACAGC TCTTCCACCA AGAGAAGATG
TTCATGGCCA TCTATATGTC AATAAAGTAG
CAATTCCACA TCCTCGTGGT CATGCACATT
ATTTTTCTCA TAGAATAGAT CATTTGTCTT
CTTTAGATGG AACTGAAAAA ATTGCTATAG
CAGTTGTGCC AACAAAACTA CATACATATA
TGACAGAAAG GGAACGTATC ATTAACCATG
TTATGAAATA TGATCTCAGT TCTCTTATGG
GGCAGTTTTT TGTAAGACTC TGTGGTATTG
TACATGGAAT TGGAAAATTT ATAGTTGAAA
ATAAACCTGT CAATTCTGTT CCTTTTGAGG
TAGAAAATAA TACACATTAA CACCTCCCGA
TAAAACCTTT TTTTAATAAT AAAATATTGT
TAAGCAGAAA ACATACTTAT TTTAAAAAAG
ATTCTATATA CGTTGTGTCT GTTACAAATG
AAGAAGCCCA ACTGGAGTGT TGCTTTGAAG
GGGTGGTATC AAAATCAGAC ATTGCTTCTT
AACTACATCT ATGGGAAAAA AAAAAACATT
TTTAAAAGAT ATGATGTCAG AATAAAATGT
AAATTCCTGC GGCCGC






utl (jGGACCA 


fYl "JV TV ^1 Ik IS m ^1 ^* 

TAACACAAGC 


ATGACTATAT 


60 


f* f* f**(~* TV f-**T>r* TV TV 

V»Vjv-LjA(- X GAA 


#n ^4j^4j 4k n 4* 4k 4k 4k 

TCGGAAAAAA 


ACTTTAAGTT 


120 


TTOPTP TV TV 

X X X GAGAG 


CTATGTAGAG 


ACTTCAGCCA 


180 


pTVTl /"'•rp tv rp/-t « « 
LAAL 1 Al VrGl* 


I l xATTAACC 


ATAATGGAAT 


240 


A X unA X AV*V> A 


TV T* TV TV 7V TV y"i 

AGTAGACAAG 


GATTTTTCTA 


300 


X X LfOL^A 1 (jAA 


cnr^ TT\f*% TV TV m TV m 

GTGTCAATAT 


GTTGGAGCGG 


360 


f»TV TPTPPh^R 


flr»^^7 ^^K^ft ^77 7k ^^ft.^44 

TGGTTTAGTT 


TATGAACCAA 


420 


AG X GGWAGaG 


GATGCTGCAG 


CTGATTCAGA 


480 


AlblbAlATT 


7k 7k 7k 7k 

TAAAAGTGCT 


TTTAAAAGTA 


540 


nil l*A X 1*>AI*>A 


O ^fT^f TV TV TV m 

blUICCAAAT 


GCATGCAGAA 


600


VarnwvrvxftA X X X 


X LfALA X AAwA 


GTGGGCAAGG


660 


TGGCAGPAPT 


IVjI vAALVfA X 


o Tv Tv cri/~»*T^nrt tv t\ 
v/AA ICl T AC A 


720


TTGGAGAGPT 


TGTTPPAflPa 


7k fl^ff^Tk tTlfT^ Tk 7k fll^^J 

AX XAX XAATC 


780


ATCACAACCA


wA X vj X X UL>nA 


T TV fr*rrifTifTi TV nun t\ 
XAX X X IAiTA 


840 


AAATATPAGP 


TAP* 7AOTA P**"*/"^ 7a *P 


WAG x TTTCTG 


900 


CTGCAGGCAG


VrV<A X VjVjAVj X l^r 


ICl OljO A x Ax 


960


TGACAGTTAC 


X V» AVJ\j AVJ AVd' 




1020 


TTGGAGGAAT 


CTTTTCAACA 


A\«A VJVJ V/A X w X 


1080


TAATTTGCTG 


TCGTTTCAGA 


CTTGGATCCT 


1140 


ATGGCCACAC 


AGACAACCAC 


TTACCTCTTT 


1200 


TTGAAGGAGA 


AAAACTTTTT 


GCCTGAGACA 


1260 


GCAATATATT 


CAAAGAAAAG 


AAAACACAAA 


1320 


AAAAAAAAGG 


ATAAAAAAAC 


CCAAACTGAA 


1380 


TCGTAGAAGA 


AATCATGCAG 


CTAAACGATG 


1440 


ATGACGCCTT 


CTTATATTTT 


CATAGCAAAT 


1500 


GCTGATAAAA 


AGCCTGAAGG 


AAATAAGTGA 


1560 


GAGAAGTGCA 


AATGTTCGCA 


TCCTTTTGTT 


1620 


GGAAAACATA 


CGGAAAAAAA 


AAAAAAAAAA 


1680 








1696

(2) INFORMATION FOR SEQ ID NO: 13:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1100 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 





GAATTCGGCA CGAGGCGGCA CGAGGCGGCA CGAGGGTGGC ATATCACGGC CATGGGGTCT 60
CAGCATTCCG CTGCTGCTCG CCCCTCCTCC TGCAGGCGAA AGCAAGAAGA TGACAGGGAC 120
GGTTTGCTGG CTGAACGAGA GCAGGAAGAA GCCATTGCTC AGTTCCCATA TGTGGAATTC 180
ACCGGGAGAG ATAGCATCAC CTGTCTCACG TGCCAGGGGA CAGGCTACAT TCCAACAGAG 240
CAAGTAAATG AGTTGGTGGC TTTGATCCCA CACAGTGATC AGAGATTGCG CCCTCAGCGA 300
ACTAAGCAAT ATGTCCTCCT GTCCATCCTG CTTTGTCTCC TGGCATCTGG TTTGGTGGTT 360
TTCTTCCTGT TTCCGCATTC AGTCCTTGTG GATGATGACG GCATCAAAGT GGTGAAAGTC 420
ACATTTAATA AGCAAGACTC CCTTGTAATT CTCACCATCA TGGCCACCCT GAAAATCAGG 480
AACTCCAACT TCTACACGGT GGCAGTGACC AGCCTGTCCA GCCAGATTCA GTACATGAAC 540
ACAGTGGTCA GTACATATGT GACTACTAAC GTCTCCCTTA TTCCACCTCG GAGTGAGCAA 600
CTGGTGAATT TTACCGGGAA GGCCGAGATG GGAGGACCGT TTTCCTATGT GTACTTCTTC 660
TGCACGGTAC CTGAGATCCT GGTGCACAAC ATAGTGATCT TCATGCGAAC TTCAGTGAAG 720
ATTTCATACA TTGGCCTCAT GACCCAGAGC TCCTTGGAGA CACATCACTA TGTGGATTGT 780
GGAGGAAATT CCACAGCTAT TTAACAACTG CTATTGGTTC TTCCACACAG CGCCTGTAGA 840
AGAGAGCACA GCATATGTTC CCAAGGCCTG AGTTCTGGAC CTACCCCCAC GTGGTGTAAG 900
CAGAGGAGGA ATTGGTTCAC TTAACTCCCA GCAAACATCC TCCTGCCACT TAGGAGGAAA 960
CACCTCCCTA TGGTACCATT TATGTTTCTC AGAACCAGCA GAATCAGTGC CTAGCCTGTG 1020
CCCAGCAAAT AGTTGGCACT CAATAAAGAT TTGCAGAATT TAAAAAAAAA AAAAAAAAAA 1080
AAAAAAATTC CTGCGGCCGC 1100



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1588 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GAATTCGGCA CGAGGGTACC TGCTTTTCTA TTGCCTCTTT GAAACAATGG TCACGTGTTT 60
CCATGTTCCC TACTCGGCTC TCACCATGTT CATCAGCACC GAGCAGACTG AGCGGGATTC 120
TGCCACCGCC TATCGGATGA CTGTGGAAGT GCTGGGCACA GTGCTGGGCA CGGCGATCCA 180
GGGACAAATC GTGGGCCAAG CAGACACGCC TTGTTTCCAG GACCTCAATA GCTCTACAGT 240
AGCTTCACAA AGTGCCAACC ATACACATGG CACCACCTCA CACAGGGAAA CGCAAAAGGC 300
ATACCTGCTG GCAGCGGGGG TCATTGTCTG TATCTATATA ATCTGTGCTG TCATCCTGAT 360
CCTGGGCGTG CGGGAGCAGA GAGAACCCTA TGAAGCCCAG CAGTCTGAGC CAATCGCCTA 420
CTTCCGGGGC CTACGGCTGG TCATGAGCCA CGGCCCATAC ATCAAACTTA TTACTGGCTT 480
CCTCTTCACC TCCTTGGCTT TCATGCTGGT GGAGGGGAAC TTTGTCTTGT TTTGCACCTA 540
CACCTTGGGC TTCCGCAATG AATTCCAGAA TCTACTCCTG GCCATCATGC TCTCGGCCAC 600
TTTAACCATT CCCATCTGGC AGTGGTTCTT GACCCGGTTT GGCAAGAAGA CAGCTGTATA 660
TGTTGGGATC TCATCAGCAG TGCCATTTCT CATCTTGGTG GCCCTCATGG AGAGTAACCT 720
CATCATTACA TATGCGGTAG CTGTGGCAGC TGGCATCAGT GTGGCAGCTG CCTTCTTACT 780
ACCCTGGTCC ATGCTGCCTG ATGTCATTGA CGACTTCCAT CTGAAGCAGC CCCACTTCCA 840
TGGAACCGAG CCCATCTTCT TCTCCTTCTA TGTCTTCTTC ACCAAGTTTG CCTCTGGAGT 900
GTCACTGGGC ATTTCTACCC TCAGTCTGGA CTTTGCAGGG TACCAGACCC GTGGCTGCTC 960
GCAGCCGGAA CGTGTCAAGT TTACACTGAA CATGCTCGTG ACCATGGCTC CCATAGTTCT 1020
CATCCTGCTG GGCCTGCTGC TCTTCAAAAT GTACCCCATT GATGAGGAGA GGCGGCGGCA 1080
GAATAAGAAG GCCCTGCAGG CACTGAGGGA CGAGGCCAGC AGCTCTGGCT GCTCAGAAAC 1140
AGACTCCACA GAGCTGGCTA GCATCCTCTA GGGCCCGCCA CGTTGCCCGA AGCCACCATG 1200
CAGAAGGCCA CAGAAGGGAT CAGGACCTGT CTGCCGGCTT GCTGAGCAGC TGGACTGCAG 1260
GTGCTAGGAA GGGAACTGAA GACTCAAGGA GGTGGCCCAG GACACTTGCT GTGCTCACTG 1320
TGGGGCCGGC TGCTCTGTGG CCTCCTGCCT CCCCTCTGCC TGCCTGTGGG GCCAAGCCCT 1380
GGGGCTGCCA CTGTGAATAT GCCAAGGACT GATCGGGCCT AGCCCGGAAC ACTAATGTAG 1440
AAACCTTTTT TTTACAGAGC CTAATTAATA ACTTAATGAC TGTGTACATA GCAATGTGTG 1500
TGTATGTATA TGTCTGTGAG CTATTAATGT TATTAATTTT CATAAAAGCT GGAAAGCAAA 1560
AAAAAAAAAA AAAAATTCCT GCGGCCGC 1588

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GAATTCGGCA CGAGGCGGAA GTCCCGTCTC ACGGTTGCCC TGGCAGCGCG CGAGGCTGGT 60
GAGTCGGCAG CCCTGTGGCA GCCGGCGGGC TGGTTTCCAT GGTTGCACGA TTAGGAACCA 120
CCAGCTGCTG CATCCCATGG CCAGGGGTGG CGTCCAGGTG GCAGAGCAGC TAGGAACGCA 180
AGGCCTGAAC CTGGGGCCAG ACACCCTGCT CTCCCGGCCA TGGTCAACGA CCCTCCAGTA 240
CCTGCCTTAC TGTGGGCCCA GGAGGTGGGC CAAGTCTTGG CAGGCCGTGC CCGCAGGCTG 300
CTGCTGCAGT TTGGGGTGCT CTTCTGCACC ATCCTCCTTT TGCTCTGGGT GTCTGTCTTC 360
CTCTATGGCT CCTTCTACTA TTCCTATATG CCGACAGTCA GCCACCTCAG CCCTGTGCAT 420
TTCTACTACA GGACCGACTG TGATTCCTCC ACCACCTCAC TCTGCTCCTT CCCTGTTGCC 480
AATGTCTCGC TGACTAAGGG TGGACGTGAT CGGGTGCTGA TGTATGGACA GCCGTATCGT 540
GTTACCTTAG AGCTTGAGCT GCCAGAGTCC CCTGTGAATC AAGATTTGGG CATGTTCTTG 600
GTCACCATTT CCTGCTACAC CAGAGGTGGC CGAATCATCT CCACTTCTTC GCGTTCGGTG 660
ATGCTGCATT ACCGCTCAGA CCTGCTCCAG ATGCTGGACA CACTGGTCTT CTCTAGCCTC 720
CTGCTATTTG GCTTTGCAGA GCAGAAGCAG CTGCTGGAGG TGGAACTCTA CGCAGACTAT 780
AGAGAGAACT CGTACGTGCC GACCACTGGA GCGATCATTG AGATCCACAG CAAGCGCATC 840
CAGCTGTATG GAGCCTACCT CCGCATCCAC GCGCACTTCA CTGGGCTCAG ATACCTGCTA 900
TACAACTTCC CGATGACCTG CGCCTTCATA GGTGTTGCCA GCAACTTCAC CTTCCTCAGC 960
GTCATCGTGC TCTTCAGCTA CATGCAGTGG GTGTGGGGGG GCATCTGGCC CCGACACCGC 1020
TTCTCTTTGC AGGTTAACAT CCGAAAAAGA GACAATTCCC GGAAGGAAGT CCAACGAAGG 1080
ATCTCTGCTC ATCAGCCAGG GCCTGAAGGC CAGGAGGAGT CAACTCCGCA ATCAGATGTT 1140
ACAGAGGATG GTGAGAGCCC TGAAGATCCC TCAGGGACAG AGGTCAGCTG TCCGAGGAGG 1200
AGAAACCAGA TCAGCAGCCC CTGAGCGGAG AAGAGGAGCT AGAGCCTGAG GCCAGTGATG 1260
GTTCAGGCTC CTGGGAAGAT GCAGCTTTGC TGACGGAGGC CAACCTGCCT GCTCCTGCTC 1320
CTGCTTCTGC TTCTGCCCCT GTCCTAGAGA CTCTGGGCAG CTCTGAACCT GCTGGGGGTG 1380
CTCTCCGACA GCGCCCCACC TGCTCTAGTT CCTGAAGAAA AGGGGCAGAC TCCTCACATT 1440
CCAGCACTTT CCCACCTGAC TCCTCTCCCC TCGTTTTTCC TTCAATAAAC TATTTTGTGT 1500
CAAAAAAAAA AAAAAAAAAA AATTCCTGCG GCCGC 1535

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1322 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GAATTCGGCA CGAGGGCGGG CGCTACGGGC TTGACTCCCC CAAGGCCGAG GTCCGCGGCC 60
AGGTGCTGGC GCCGCTGCCC CTCCACGGAG TTGCTGATCA TCTGGGCTGT GATCCACAAA 120
CCCGGTTCTT TGTCCCTCCT AATATCAAAC AGTGGATTGC CTTGCTGCAG AGGGGAAACT 180
GCACGTTTAA AGAGAAAATA TCACGGGCCG CTTTCCACAA TGCAGTTGCT GTAGTCATCT 240
ACAATAATAA ATCCAAAGAG GAGCCAGTTA CCATGACTCA TCCAGGCACT GGAGATATTA 300
TTGCTGTCAT GATAACAGAA TTGAGGGGTA AGGATATTTT GAGTTATCTG GAGAAAAACA 360
TCTCTGTACA AATGACAATA GCTGTTGGAA CTCGAATGCC ACCGAAGAAC TTCAGCCGTG 420
GCTCTCTAGT CTTCGTGTCA ATATCCTTTA TTGTTTTGAT GATTATTTCT TCAGCATGGC 480
TCATATTCTA CTTCATTCAA AAGATCAGGT ACACAAATGC ACGCGACAGG AACCAGCGTC 540
GTCTCGGAGA TGCAGCCAAG AAAGCCATCA GTAAATTGAC AACCAGGACA GTAAAGAAGG 600
GTGACAAGGA AACTGACCCA GACTTTGATC ATTGTGCAGT CTGCATAGAG AGCTATAAGC 660
AGAATGATGT CGTCCGAATT CTCCCCTGCA AGCATGTTTT CCACAAATCC TGCGTGGATC 720
CCTGGCTTAG TGAACATTGT ACCTGTCCTA TGTGCAAACT TAATATATTG AAGGCCCTGG 780
GAATTGTGCC GAATTTGCCA TGTACTGATA ACGTAGCATT CGATATGGAA AGGCTCACCA 840
GAACCCAAGC TGTTAACCGA AGATCAGCCC TCGGCGACCT CGCCGGCGAC AACTCCCTTG 900
GCCTTGAGCC ACTTCGAACT TCGGGGATCT CACCTCTTCC TCAGGATGGG GAGCTCACTC 960
CGAGAACAGG AGAAATCAAC ATTGCAGTAA CAAAAGAATG GTTTATTATT GCCAGTTTTG 1020
GCCTCGTCAG TGCCCTCACA CTCTGCTACA TGATCATCAG AGCCACAGCT AGCTTGAATG 1080
CTAATGAGGT AGAATGGTTT TGAAGAAGAA AAAACCTGCT TTCTGACTGA TTTTGCCTTG 1140
AAGGAAAAAA GAACCTATTT TTGTGCATCA TTTACCAATC ATGCCACACA AGCATTTATT 1200
TTTAGTACAT TTTATTTTTT CATAAAATTG CTAATGCCAA AGCTTTGTAT TAAAAGAAAT 1260
AAATAATAAA ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAT TCCTGCGGCC 1320
GC 1322

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1711 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GAATTCGGCA CGAGGCCCTC CCGCGCTCCC GGGGCGCGCG GGCCGCGCCC CCGACGCCCT 60
ACATATACTC AGGTGCGCCC CACCTGTCCG CCCGCACCTG CTGGCTCACC TCCGAGCCAC 120
CTCTGCTGCG CACCGCAGCC TCGGACCTAC AGCCCAGGAT ACTTTGGGAC TTGCCGGCGC 180
TCAGAAACGC GCCCAGACGG CCCCTCCACC TTTTGTTTGC CTAGGGTCGC CGAGAGCGCC 240
CGGAGGGAAC CGCCTGGCCT TCGGGGACCA CCAATTTTGT CTGGAACCAC CCTCCCGGCG 300
TATCCTACTC CCTGTGCCGC GAGGCCATCG CTTCACTGGA GGGGTCGATT TGTGTGTAGT 360
TTGGTGACAA GATTTGCATT CACCTGGCCC AAACCCTTTT TGTCTCTTTG GGTGACCGGA 420
AAACTCCACC TCAAGTTTTC TTTTGTGGGG CTGCCCCCCA AGTGTCGTTT GTTTTACTGT 480
AGGGTCTCCC GCCCGGCGCC CCCAGTGTTT TCTGAGGGCG GAAATGGCCA ATTCGGGCCT 540
GCAGTTGCTG GGCTTCTCCA TGGCCCTGCT GGGCTGGGTG GGTCTGGTGG CCTGCACCGC 600
CATCCCGCAG TGGCAGATGA GCTCCTATGC GGGTGACAAC ATCATCACGG CCCAGGCCAT 660
GTACAAGGGG CTGTGGATGG ACTGCGTCAC GCAGAGCACG GGGATGATGA GCTGCAAAAT 720
GTACGACTCG GTGCTCGCCC TGTCCGCGGC CTTGCAGGCC ACTCGAGCCC TAATGGTGGT 780
CTCCCTGGTG CTGGGCTTCC TGGCCATGTT TGTGGCCACG ATGGGCATGA AGTGCACGCG 840
CTGTGGGGGA GACGACAAAG TGAAGAAGGC CCGTATAGCC ATGGGTGGAG GCATAATTTT 900
CATCGTGGCA GGTCTTGCCG CCTTGGTAGC TTGCTCCTGG TATGGCCATC AGATTGTCAC 960
AGACTTTTAT AACCCTTTGA TCCCTACCAA CATTAAGTAT GAGTTTGGCC CTGCCATCTT 1020
TATTGGCTGG GCAGGGTCTG CCCTAGTCAT CCTGGGAGGT GCACTGCTCT CCTGTTCCTG 1080
TCCTGGGAAT GAGAGCAAGG CTGGGTACCG TGCACCCCGC TCTTACCCTA AGTCCAACTC 1140
TTCCAAGGAG TATGTGTGAC CTGGGATCTC CTTGCCCCAG CCTGACAGGC TATGGGAGTG 1200
TCTAGATGCC TGAAAGGGCC TGGGGCTGAG CTCAGCCTGT GGGCAGGGTG CCGGACAAAG 1260
GCCTCCTGGT CACTCTGTCC CTGCACTCCA TGTATAGTCC TCTTGGGTTG GGGGTGGGGG 1320
GGTGCCGTTG GTGGGAGAGA CAAAAAGAGG GAGAGTGTGC TTTTTGTACA GTAATAAAAA 1380
ATAAGTATTG GGAAGCAGGC TTTTTTCCCT TCAGGGCCTC TGCTTTCCTC CCGTCCAGAT 1440
CCTTGCAGGG AGCTTGGAAC CTTAGTGCAC CTACTTCAGT TCAGAACACT TAGCACCCCA 1500
CTGACTCCAC TGACAATTGA CTAAAAGATG CAGGTGCTCG TATCTCGACA TTCATTCCCA 1560
CCCCCCTCTT ATTTAAATAG CTACCAAAGT ACTTCTTTTT TAATAAAAAA ATAAAGATTT 1620
TTATTAGGTA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1680
AAAAAAAAAA AAAAAAAATT CCTGCGGCCG C 1711



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1553 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GAATTCGGCA CGAGGGCAGG TCCAGAGTAA AGTCACTGAA GAGTGGAAGC GAGGAAGGAA 60
CAGGATGATT AGACCTCAGC TGCGGACCGC GGGGCTGGGA CGATGCCTCC TGCCGGGGCT 120
GCTGCTGCTC CTGGTGCCCG TCCTCTGGGC CGGGGCTGAA AAGCTACATA CCCAGCCCTC 180
CTGCCCCGCG GTCTGCCAGC CCACGCGCTG CCCCGCGCTG CCCACCTGCG CGCTGGGGAC 240
CACGCCGGTG TTCGACCTGT GCCGCTGTTG CCGCGTCTGC CCCGCGGCCG AGCGTGAAGT 300
CTGCGGCGGG GCGCAGGGCC AACCGTGCGC CCCGGGGCTG CAGTGCCTCC AGCCGCTGCG 360
CCCCGGGTTC CCCAGCACCT GCGGTTGCCC GACGCTGGGA GGGGCCGTGT GCGGCAGCGA 420
CAGGCGCACC TACCCCAGCA TGTGCGCGCT CCGGGCCGAA AACCGCGCCG CGCGCCGCCT 480
GGGCAAGGTC CCGGCCGTGC CTGTGCAGTG GGGGAACTGC GGGGATACAG GGACCAGAAG 540
CGCAGGCCCG CTCAGGAGGA ATTACAACTT CATCGCCGCG GTGGTGGAGA AGGTGGCGCC 600
ATCGGTGGTT CACGTGCAGC TGTGGGGCAG GTTACTTCAC GGCAGCAGGC TTGTTCCTGT 660
GTACAGTGGC TCTGGGTTCA TAGTGTCTGA GGACGGGCTC ATTATTACCA ATGCCCATGT 720
TGTCAGGAAC CAGCAGTGGA TTGAGGTGGT GCTCCAGAAT GGGGCCCGTT ATGAAGCTGT 780
TGTCAAGGAT ATTGACCTTA AATTGGATCT TGCGGTGATT AAGATTGAAT CAAATGCTGA 840
ACTTCCTGTA CTGATGCTGG GAAGATCATC TGACCTTCGG GCTGGAGAGT TTGTGGTGGC 900
TTTGGGCAGC CCATTTTCTC TGCAGAACAC AGCTACTGCA GGAATTGTCA GCACCAAACA 960
GCGAGGGGGC AAAGAACTGG GGATGAAGGA TTCAGATATG GACTACGTCC AGATTGATGC 1020
CACAATTAAC TATGGGAATT CTGGTGGTCC TCTGGTGAAC TTGGATGGTG ATGTGATTGG 1080
CGTCAATTCA TTGAGGGTGA CTGATGGAAT CTCCTTTGCA ATTCCTTCAG ATCGAGTTAG 1140
GCAGTTCTTG GCAGAATACC ATGAGCACCA GATGAAAGGA AAGGCGTTTT CAAATAAGAA 1200
ATATCTGGGT CTGCAAATGC TGTCCCTCAC TGTGCCCCTT AGTGAAGAAT TGAAAATGCA 1260
TTATCCAGAT TTCCCTGATG TGAGTTCTGG GGTTTATGTA TGTAAAGTGG TTGAAGGAAC 1320
AGCTGCTCAA AGCTCTGGAT TGAGAGATCA CGATGTAATT GTCAACATAA ATGGGAAACC 1380
TATTACTACT ACAACTGATG TTGTTAAAGC TCTTGACAGT GATTCCCTTT CCATGGCTGT 1440
TCTTCGGGGA AAAGATAATT TGCTCCTGAC AGTCATACCT GAAACAATCA ATTAAATATC 1500
TTGTTTTAAA GTGGGATTAT CTAAAAAAAA AAAAAAAAAA TTCCTGCGGC CGC 1553



(2) INFORMATION FOR SEQ ID NO: 19:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1596 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



GAATTCGGCA CGAGGGGAGC CGCTCCCGGA GCCCGGCCGT AGAGGCTGCA ATCGCAGCCG 60
GGAGCCCGCA GCCCGCGCCC CGAGCCCGCC GCCGCCCTTC GAGGGCGCCC CAGGCCGCGC 120
CATGGTGAAG GTGACGTTCA ACTCCGCTCT GGCCCAGAAG GAGGCCAAGA AGGACGAGCC 180
CGAGAGCGGC GAGGAGGCGC TCATCATCCC CCCCGACGCC GTCGCGGTGG ACTGCAAGGA 240
CCCAGATGAT GTGGTACCAG TTGGCCAAAG AAGAGCCTGG TGTTGGTGCA TGTGCTTTGG 300
ACTAGCATTT ATGCTTGCAG GTGTTATTCT AGGAGGAGCA TACTTGTACA AATATTTTGC 360
ACTTCAACCA GATGACGTGT ACTACTGTGG AATAAAGTAC ATCAAAGATG ATGTCATCTT 420
AAATGAGCCC TCTGCAGATG CCCCAGCTGC TCTCTACCAG ACAATTGAAG AAAATATTAA 480
AATCTTTGAA GAAGAAGAAG TTGAATTTAT CAGTGTGCCT GTCCCAGAGT TTGCAGATAG 540
TGATCCTGCC AACATTGTTC ATGACTTTAA CAAGAAACTT ACAGCCTATT TAGATCTTAA 600
CCTGGATAAG TGCTATGTGA TCCCTCTGAA CACTTCCATT GTTATGCCAC CCAGAAACCT 660
ACTGGAGTTA CTTATTAACA TCAAGGCTGG AACCTATTTG CCTCAGTCCT ATCTGATTCA 720
TGAGCACATG GTTATTACTG ATCGCATTGA AAACATTGAT CACCTGGGTT TCTTTATTTA 780
TCGACTGTGT CATGACAAGG AAACTTACAA ACTGCAACGC AGAGAAACTA TTAAAGGTAT 840
TCAGAAACGT GAAGCCAGCA ATTGTTTCGC AATTCGGCAT TTTGAAAACA AATTTGCCGT 900
GGAAACTTTA ATTTGTTCTT GAACAGTCAA GAAAAACATT ATTGAGGAAA ATTAATATCA 960
CAGCATAACC CCACCCTTTA CATTTTGTGC AGTGATATTT TTTAAAGTCT CTTTCATGTA 1020
AGTAGCAAAC AGGGCTTTAC TATCTTTTCA TCTCATTAAT TCAATTAAAA CCATTACCTT 1080
AAAATTTTTT TCTTTCGAAG TGTGGTGTCT TTTATATTTG AATTAGTAAC TGTATGAAGT 1140
CATAGATAAT AGTACATGTC ACCTTAGGTA GTAGGAAGAA TTACAATTTC TTTAAATCAT 1200
TTATCTGGAT TTTTATGTTT TATTAGCATT TTCAAGAAGA CGGATTATCT AGAGAATAAT 1260
CATATATATG CATACGTAAA AATGGACCAC AGTGACTTAT TTGTAGTTGT TAGTTGCCCT 1320
GCTACCTAGT TTGTTAGTGC ATTTGAGCAC ACATTTTAAT TTTCCTCTAA TTAAAATGTG 1380
CAGTATTTTC AGTGTCAAAT ATATTTAACT ATTTAGAGAA TGATTTCCAC CTTTATGTTT 1440
TAATATCCTA GGCATCTGCT GTAATAATAT TTTAGAAAAT GTTTGGAATT TAAGAAATAA 1500
CTTGTGTTAC TAATTTGTAT AACCCATATC TGTGCAATGG AATATAAATA TCACAAAGTT 1560
GTTTAAAAAA AAAAAAAAAA AAATTCCTGC GGCCGC 1596



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Ala Trp Arg Arg Arg Glu Ala Gly Val Gly Ala Arg Gly Val Leu
1 5 10 15

Ala Leu Ala Leu Leu Ala Leu Ala Leu Cys Val Pro Gly Ala Arg Gly
20 25 30

Arg Ala Leu Glu Trp Phe Ser Ala Val Val Asn Ile Glu Tyr Val Asp
35 40 45

Pro Gln Thr Asn Leu Thr Val Trp Ser Val Ser Glu Ser Gly Arg Phe
50 55 60

Gly Asp Ser Ser Pro Lys Glu Gly Ala His Gly Leu Val Gly Val Pro
65 70 75 80

Trp Ala Pro Gly Gly Asp Leu Glu Gly Cys Ala Pro Asp Thr Arg Phe
85 90 95

Phe Val Pro Glu Pro Gly Gly Arg Gly Ala Ala Pro Trp Val Ala Leu
100 105 110

Val Ala Arg Gly Gly Cys Thr Phe Lys Asp Lys Val Leu Val Ala Ala
115 120 125

Arg Arg Asn Ala Ser Ala Val Val Leu Tyr Asn Glu Glu Arg Tyr Gly
130 135 140

Asn Ile Thr Leu Pro Met Ser His Ala Gly Thr Gly Asn Ile Val Val
145 150 155 160

Ile Met Ile Ser Tyr Pro Lys Gly Arg Glu Ile Leu Glu Leu Val Gln
165 170 175

Lys Gly Ile Pro Val Thr Met Thr Ile Gly Val Gly Thr Arg His Val
180 185 190

Gln Glu Phe Ile Ser Gly Gln Ser Val Val Phe Val Ala Ile Ala Phe
195 200 205

Ile Thr Met Met Ile Ile Ser Leu Ala Trp Leu Ile Phe Tyr Tyr Ile
210 215 220

Gln Arg Phe Leu Tyr Thr Gly Ser Gln Ile Gly Ser Gln Ser His Arg
225 230 235 240

Lys Glu Thr Lys Lys Val Ile Gly Gln Leu Leu Leu His Thr Val Lys
245 250 255

His Gly Glu Lys Gly Ile Asp Val Asp Ala Glu Asn Cys Ala Val Cys
260 265 270

Ile Glu Asn Phe Lys Val Lys Asp Ile Ile Arg Ile Leu Pro Cys Lys
275 280 285

His Ile Phe His Arg Ile Cys Ile Asp Pro Trp Leu Leu Asp His Arg
290 295 300

Thr Cys Pro Met Cys Lys Leu Asp Val Ile Lys Ala Leu Gly Tyr Trp
305 310 315 320

Gly Glu Pro Gly Asp Val Gln Glu Met Pro Ala Pro Glu Ser Pro Pro
325 330 335

Gly Arg Asp Pro Ala Ala Asn Leu Ser Leu Ala Leu Pro Asp Asp Asp
340 345 350

Gly Ser Asp Asp Ser Ser Pro Pro Ser Ala Ser Pro Ala Glu Ser Glu
355 360 365

Pro Gln Cys Asp Pro Ser Phe Lys Gly Asp Ala Gly Glu Asn Thr Ala
370 375 380

Leu Leu Glu Ala Gly Arg Ser Asp Ser Arg His Gly Gly Pro Ile Ser
385 390 395 400

(2) INFORMATION FOR SEQ ID NO: 21:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 291 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Asp Lys Gly Ser Ala Gly His Pro Gly Gly Val Leu Val Trp Gly
1 5 10 15

Arg Ser Pro Ala Pro Thr Ala Leu Trp Gly Ala Ser Pro Trp Leu Ser
20 25 30

Pro Leu Thr Ser Ala Leu Arg Gln Pro Leu His Arg Ala Pro Leu Leu
35 40 45

Pro Gly Gln Leu Cys Trp Ser Pro Arg Pro Leu Glu Lys Asn Lys Ala
50 55 60

Met Gly Arg Pro Leu Leu Leu Pro Leu Leu Leu Leu Leu Gln Pro Pro
65 70 75 80

Ala Phe Leu Gln Pro Gly Gly Ser Thr Gly Ser Gly Pro Ser Tyr Leu
85 90 95

Tyr Gly Val Thr Gln Pro Lys His Leu Ser Ala Ser Met Gly Gly Ser
100 105 110

Val Glu Ile Pro Phe Ser Phe Tyr Tyr Pro Trp Glu Leu Ala Ile Val
115 120 125

Pro Asn Val Arg Ile Ser Trp Arg Arg Gly His Phe His Gly Gln Ser
130 135 140

Phe Tyr Ser Thr Arg Pro Pro Ser Ile His Lys Asp Tyr Val Asn Arg
145 150 155 160

Leu Phe Leu Asn Trp Thr Glu Gly Gln Glu Ser Gly Phe Leu Arg Ile
165 170 175

Ser Asn Leu Arg Lys Glu Asp Gln Ser Val Tyr Phe Cys Arg Val Glu
180 185 190

Leu Asp Thr Arg Arg Ser Gly Arg Gln Gln Leu Gln Ser Ile Lys Gly
195 200 205

Thr Lys Leu Thr Ile Thr Gln Ala Val Thr Thr Thr Thr Thr Trp Arg
210 215 220

Pro Ser Ser Thr Thr Thr Ile Ala Gly Leu Arg Val Thr Glu Ser Lys
225 230 235 240

Gly His Ser Glu Ser Trp His Leu Ser Leu Asp Thr Ala Ile Arg Val
245 250 255

Ala Leu Ala Val Ala Val Leu Lys Thr Val Ile Leu Gly Leu Leu Cys
260 265 270

Leu Leu Leu Leu Trp Trp Arg Arg Arg Lys Gly Ser Arg Ala Pro Ser
275 280 285

Ser Asp Phe
290

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Thr Val Ser Gln Arg Phe Gln Leu Ser Asn Ser Gly Pro Asn Ser
1 5 10 15

Thr Ile Lys Met Lys Ile Ala Leu Arg Val Leu His Leu Glu Lys Arg
20 25 30

Glu Arg Pro Pro Asp His Gln His Ser Ala Gln Val Lys Arg Pro Ser
35 40 45

Val Ser Lys Glu Gly Arg Lys Thr Ser Ile Lys Ser His Met Ser Gly
50 55 60

Ser Pro Gly Pro Gly Gly Ser Asn Thr Ala Pro Ser Thr Pro Val Ile
65 70 75 80

Gly Gly Ser Asp Lys Pro Gly Met Glu Glu Lys Ala Gln Pro Pro Glu
85 90 95

Ala Gly Pro Gln Gly Leu His Asp Leu Gly Arg Ser Ser Ser Ser Leu
100 105 110

Leu Ala Ser Pro Gly His Ile Ser Val Lys Glu Pro Thr Pro Ser Ile
115 120 125

Ala Ser Asp Ile Ser Leu Pro Ile Ala Thr Gln Glu Leu Arg Gln Arg
130 135 140

Leu Arg Gln Leu Glu Asn Gly Thr Thr Leu Gly Gln Ser Pro Leu Gly
145 150 155 160

Gln Ile Gln Leu Thr Ile Arg His Ser Ser Gln Arg Asn Lys Leu Ile
165 170 175

Val Val Val His Ala Cys Arg Asn Leu Ile Ala Phe Ser Glu Asp Gly
180 185 190

Ser Asp Pro Tyr Val Arg Met Tyr Leu Leu Pro Asp Lys Arg Arg Ser
195 200 205

Gly Arg Arg Lys Thr His Val Ser Lys Lys Thr Leu Asn Pro Val Phe
210 215 220

Asp Gln Ser Phe Asp Phe Ser Val Ser Leu Pro Glu Val Gln Arg Arg
225 230 235 240

Thr Leu Asp Val Ala Val Lys Asn Ser Gly Gly Phe Leu Ser Lys Asp
245 250 255

Lys Gly Leu Leu Gly Lys Val Leu Val Ala Leu Ala Ser Glu Glu Leu
260 265 270

Ala Lys Gly Trp Thr Gln Trp Tyr Asp Leu Thr Glu Asp Gly Thr Arg
275 280 285

Pro Gln Ala Met Thr
290



(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 206 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



Met Glu Arg Arg His Pro Val Cys Ser Gly Thr Cys Gln Pro Thr Gln
1 5 10 15

Phe Arg Cys Ser Asn Gly Cys Cys Ile Asp Ser Phe Leu Glu Cys Asp
20 25 30

Asp Thr Pro Asn Cys Pro Asp Ala Ser Asp Glu Ala Ala Cys Glu Lys
35 40 45

Tyr Thr Ser Gly Phe Asp Glu Leu Gln Arg Ile His Phe Pro Ser Asp
50 55 60

Lys Gly His Cys Val Asp Leu Pro Asp Thr Gly Leu Cys Lys Glu Ser
65 70 75 80

Ile Pro Arg Trp Tyr Tyr Asn Pro Phe Ser Glu His Cys Ala Arg Phe
85 90 95

Thr Tyr Gly Gly Cys Tyr Gly Asn Lys Asn Asn Phe Glu Glu Glu Gln
100 105 110

Gln Cys Leu Glu Ser Cys Arg Gly Ile Ser Lys Lys Asp Val Phe Gly
115 120 125

Leu Arg Arg Glu Ile Pro Ile Pro Ser Thr Gly Ser Val Glu Met Ala
130 135 140

Val Ala Val Phe Leu Val Ile Cys Ile Val Val Val Val Ala Ile Leu
145 150 155 160

Gly Tyr Cys Phe Phe Lys Asn Gln Arg Lys Asp Phe His Gly His His
165 170 175

His His Pro Pro Pro Thr Pro Ala Ser Ser Thr Val Ser Thr Thr Glu
180 185 190

Asp Thr Glu His Leu Val Tyr Asn His Thr Thr Arg Pro Leu
195 200 205



(2) INFORMATION FOR SEQ ID NO: 24: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 220 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



Met Ala Gly Leu Ser Arg Gly Ser Ala Arg Ala Leu Leu Ala Ala Leu
1 5 10 15

Leu Ala Ser Thr Leu Leu Ala Leu Leu Val Ser Pro Ala Arg Gly Arg
20 25 30

Gly Gly Arg Asp His Gly Asp Trp Asp Glu Ala Ser Arg Leu Pro Pro
35 40 45

Leu Pro Pro Arg Glu Asp Ala Ala Arg Val Ala Arg Phe Val Thr His
50 55 60

Val Ser Asp Trp Gly Ala Leu Ala Thr Ile Ser Thr Leu Glu Ala Val
65 70 75 80

Arg Gly Arg Pro Phe Ala Asp Val Leu Ser Leu Ser Asp Gly Pro Pro
85 90 95

Gly Ala Gly Ser Gly Val Pro Tyr Phe Tyr Leu Ser Pro Leu Gln Leu
100 105 110

Ser Val Ser Asn Leu Gln Glu Asn Pro Tyr Ala Thr Leu Thr Met Thr
115 120 125

Leu Ala Gln Thr Asn Phe Cys Lys Lys His Gly Phe Asp Pro Gln Ser
130 135 140

Pro Leu Cys Val His Ile Met Leu Ser Gly Thr Val Thr Lys Val Asn
145 150 155 160

Glu Thr Glu Met Asp Ile Ala Lys His Ser Leu Phe Ile Arg His Pro
165 170 175

Glu Met Lys Thr Trp Pro Ser Ser His Asn Trp Phe Phe Ala Lys Leu
180 185 190

Asn Ile Thr Asn Ile Trp Val Leu Asp Tyr Phe Gly Gly Pro Lys Ile
195 200 205

Val Thr Pro Glu Glu Tyr Tyr Asn Val Thr Val Gln
210 215 220



(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 197 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



Met Asp His His Cys Pro Trp Leu Asn Asn Cys Val Gly His Tyr Asn
1 5 10 15

His Arg Tyr Phe Phe Ser Phe Cys Phe Phe Met Thr Leu Gly Cys Val
20 25 30

Tyr Cys Ser Tyr Gly Ser Trp Asp Leu Phe Arg Glu Ala Tyr Ala Ala
35 40 45

Ile Glu Lys Met Lys Gln Leu Asp Lys Asn Lys Leu Gln Ala Val Ala
50 55 60

Asn Gln Thr Tyr His Gln Thr Pro Pro Pro Thr Phe Ser Phe Arg Glu
65 70 75 80

Arg Met Thr His Lys Ser Leu Val Tyr Leu Trp Phe Leu Cys Ser Ser
85 90 95

Val Ala Leu Ala Leu Gly Ala Leu Thr Val Trp His Ala Val Leu Ile
100 105 110

Ser Arg Gly Glu Thr Ser Ile Glu Arg His Ile Asn Lys Lys Glu Arg
115 120 125

Arg Arg Leu Gln Ala Lys Gly Arg Val Phe Arg Asn Pro Tyr Asn Tyr
130 135 140

Gly Cys Leu Asp Asn Trp Lys Val Phe Leu Gly Val Asp Thr Gly Arg
145 150 155 160

His Trp Leu Thr Arg Val Leu Leu Pro Ser Thr His Leu Pro His Gly
165 170 175

Asn Gly Met Ser Trp Glu Pro Pro Pro Trp Val Thr Ala His Ser Ala
180 185 190

Ser Val Met Ala Val
195

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 451 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Ala Pro Leu Gly Met Leu Leu Gly Leu Leu Met Ala Ala Cys Phe
1 5 10 15

Thr Phe Cys Leu Ser His Gln Asn Leu Lys Glu Phe Ala Leu Thr Asn
20 25 30

Pro Glu Lys Ser Ser Thr Lys Glu Thr Glu Arg Lys Glu Thr Lys Ala
35 40 45

Glu Glu Glu Leu Asp Ala Glu Val Leu Glu Val Phe His Pro Thr His
50 55 60

Glu Trp Gln Ala Leu Gln Pro Gly Gln Ala Val Pro Ala Gly Ser His
65 70 75 80

Val Arg Leu Asn Leu Gln Thr Gly Glu Arg Glu Ala Lys Leu Gln Tyr
85 90 95

Glu Asp Lys Phe Arg Asn Asn Leu Lys Gly Lys Arg Leu Asp Ile Asn
100 105 110

Thr Asn Thr Tyr Thr Ser Gln Asp Leu Lys Ser Ala Leu Ala Lys Phe
115 120 125

Lys Glu Gly Ala Glu Met Glu Ser Ser Lys Glu Asp Lys Ala Arg Gln
130 135 140

Ala Glu Val Lys Arg Leu Phe Arg Pro Ile Glu Glu Leu Lys Lys Asp
145 150 155 160

Phe Asp Glu Leu Asn Val Val Ile Glu Thr Asp Met Gln Ile Met Val
165 170 175

Arg Leu Ile Asn Lys Phe Asn Ser Ser Ser Ser Ser Leu Glu Glu Lys
180 185 190

Ile Ala Ala Leu Phe Asp Leu Glu Tyr Tyr Val His Gln Met Asp Asn
195 200 205

Ala Gln Asp Leu Leu Ser Phe Gly Gly Leu Gln Val Val Ile Asn Gly
210 215 220

Leu Asn Ser Thr Glu Pro Leu Val Lys Glu Tyr Ala Ala Phe Val Leu
225 230 235 240

Gly Ala Ala Phe Ser Ser Asn Pro Lys Val Gln Val Glu Ala Ile Glu
245 250 255

Gly Gly Ala Leu Gln Lys Leu Leu Val Ile Leu Ala Thr Glu Gln Pro
260 265 270

Leu Thr Ala Lys Lys Lys Val Leu Phe Ala Leu Cys Ser Leu Leu Arg
275 280 285

His Phe Pro Tyr Ala Gln Arg Gln Phe Leu Lys Leu Gly Gly Leu Gln
290 295 300

Val Leu Arg Thr Leu Val Gln Glu Lys Gly Thr Glu Val Leu Ala Val
305 310 315 320

Arg Val Val Thr Leu Leu Tyr Asp Leu Val Thr Glu Lys Met Phe Ala
325 330 335

Glu Glu Glu Ala Glu Leu Thr Gln Glu Met Ser Pro Glu Lys Leu Gln
340 345 350

Gln Tyr Arg Gln Val His Leu Leu Pro Gly Leu Trp Glu Gln Gly Trp
355 360 365

Cys Glu Ile Thr Ala His Leu Leu Ala Leu Pro Glu His Asp Ala Arg
370 375 380

Glu Lys Val Leu Gln Thr Leu Gly Val Leu Leu Thr Thr Cys Arg Asp
385 390 395 400

Arg Tyr Arg Gln Asp Pro Gln Leu Gly Arg Thr Leu Ala Ser Leu Gln
405 410 415

Ala Glu Tyr Gln Val Leu Ala Ser Leu Glu Leu Gln Asp Gly Glu Asp
420 425 430

Glu Gly Tyr Phe Gln Glu Leu Leu Gly Ser Val Asn Ser Leu Leu Lys
435 440 445

Glu Leu Arg
450


(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Met Trp Gln Ala Gly Lys Arg Gln Ala Ser Arg Ala Phe Ser Leu Tyr
1 5 10 15

Ala Asn Ile Asp Ile Leu Arg Pro Tyr Phe Asp Val Glu Pro Ala Gln
20 25 30

Val Arg Ser Arg Leu Leu Glu Ser Met Ile Pro Ile Lys Met Val Asn
35 40 45

Phe Pro Gln Lys Ile Ala Gly Glu Leu Tyr Gly Pro Leu Met Leu Val
50 55 60

Phe Thr Leu Val Ala Ile Leu Leu His Gly Met Lys Thr Ser Asp Thr
65 70 75 80

Ile Ile Arg Glu Gly Thr Leu Met Gly Thr Ala Ile Gly Thr Cys Phe
85 90 95

Gly Tyr Trp Leu Gly Val Ser Ser Phe Ile Tyr Phe Leu Ala Tyr Leu
100 105 110

Cys Asn Ala Gln Ile Thr Met Leu Gln Met Leu Ala Leu Leu Gly Tyr
115 120 125

Gly Leu Phe Gly His Cys Ile Val Leu Phe Ile Thr Tyr Asn Ile His
130 135 140

Leu His Ala Leu Phe Tyr Leu Phe Trp Leu Leu Val Gly Gly Leu Ser
145 150 155 160

Thr Leu Arg Met Val Ala Val Leu Val Ser Arg Thr Val Gly Pro Thr
165 170 175

Gln Arg Leu Leu Leu Cys Gly Thr Leu Ala Ala Leu His Met Leu Phe
180 185 190

Leu Leu Tyr Leu His Phe Ala Tyr His Lys Val Val Glu Gly Ile Leu
195 200 205

Asp Thr Leu Glu Gly Pro Asn Ile Pro Pro Ile Gln Arg Val Pro Arg
210 215 220

Asp Ile Pro Ala Met Leu Pro Ala Ala Arg Leu Pro Thr Thr Val Leu
225 230 235 240

Asn Ala Thr Ala Lys Ala Val Ala Val Thr Leu Gln Ser His
245 250

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Gly Ser Glu Asn Glu Ala Leu Asp Leu Ser Met Lys Ser Val Pro
1 5 10 15

Trp Leu Lys Ala Gly Glu Val Ser Pro Pro Ile Phe Gln Glu Asp Ala
20 25 30

Ala Leu Asp Leu Ser Val Ala Ala His Arg Lys Ser Glu Pro Pro Pro
35 40 45

Glu Thr Leu Tyr Asp Ser Gly Ala Ser Val Asp Ser Ser Gly His Thr
50 55 60

Val Met Glu Lys Leu Pro Ser Gly Met Glu Ile Ser Phe Ala Pro Ala
65 70 75 80

Thr Ser His Glu Ala Pro Ala Met Met Asp Ser His Ile Ser Ser Ser
85 90 95

Asp Ala Ala Thr Glu Met Leu Ser Gln Pro Asn His Pro Ser Gly Glu
100 105 110

Val Lys Ala Glu Asn Asn Ile Glu Met Val Gly Glu Ser Gln Ala Ala
115 120 125

Lys Val Ile Val Ser Val Glu Asp Ala Val Pro Thr Ile Phe Cys Gly
130 135 140

Lys Ile Lys Gly Leu Ser Gly Val Ser Thr Lys Asn Phe Ser Phe Lys
145 150 155 160

Arg Glu Asp Ser Val Leu Gln Gly Tyr Asp Ile Asn Ser Gln Gly Glu
165 170 175

Glu Ser Met Gly Asn Ala Glu Pro Leu Arg Lys Pro Ile Lys Asn Arg
180 185 190

Ser Ile Lys Leu Lys Lys Val Asn Ser Gln Glu Val His Met Leu Pro
195 200 205

Ile Lys Lys Gln Arg Leu Ala Thr Phe Phe Pro Arg Lys
210 215 220



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 266 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 



Met Val Lys Val Thr Phe Asn Ser Ala Leu Ala Gln Lys Glu Ala Lys
1 5 10 15

Lys Asp Glu Pro Lys Ser Gly Glu Glu Ala Leu Ile Ile Pro Pro Asp
20 25 30

Ala Val Ala Val Asp Cys Lys Asp Pro Asp Asp Val Val Pro Val Gly
35 40 45

Gln Arg Arg Ala Trp Cys Trp Cys Met Cys Phe Gly Leu Ala Phe Met
50 55 60

Leu Ala Gly Val Ile Leu Gly Gly Ala Tyr Leu Tyr Lys Tyr Phe Ala
65 70 75 80

Leu Gln Pro Asp Asp Val Tyr Tyr Cys Gly Ile Lys Tyr Ile Lys Asp
85 90 95

Asp Val Ile Leu Asn Glu Pro Ser Ala Asp Ala Pro Ala Ala Leu Tyr
100 105 110

Gln Thr Ile Glu Glu Asn Ile Lys Ile Phe Glu Glu Glu Glu Val Glu
115 120 125

Phe Ile Ser Val Pro Val Pro Glu Phe Ala Asp Ser Asp Pro Ala Asn
130 135 140

Ile Val His Asp Phe Asn Lys Lys Leu Thr Ala Tyr Leu Asp Leu Asn
145 150 155 160

Leu Asp Lys Cys Tyr Val Ile Pro Leu Asn Thr Ser Ile Val Met Pro
165 170 175

Pro Arg Asn Leu Leu Glu Leu Leu Ile Asn Ile Lys Ala Gly Thr Tyr
180 185 190

Leu Pro Gln Ser Tyr Leu Ile His Glu His Met Val Ile Thr Asp Arg
195 200 205

Ile Glu Asn Ile Asp His Leu Gly Phe Phe Ile Tyr Arg Leu Cys His
210 215 220

Asp Lys Glu Thr Tyr Lys Leu Gln Arg Arg Glu Thr Ile Lys Gly Ile
225 230 235 240

Gln Lys Arg Glu Ala Ser Asn Cys Phe Ala Ile Arg His Phe Glu Asn
245 250 255

Lys Phe Ala Val Glu Thr Leu Ile Cys Ser
260 265

(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 251 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: None



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



Met Pro Thr Gly Asp Phe Asp Ser Lys Pro Ser Trp Ala Asp Gln Val
1 5 10 15

Glu Glu Glu Gly Glu Asp Asp Lys Cys Val Thr Ser Glu Leu Leu Lys
20 25 30

Gly Ile Pro Leu Ala Thr Gly Asp Thr Ser Pro Glu Pro Glu Leu Leu
35 40 45

Pro Gly Ala Pro Leu Pro Pro Pro Lys Glu Val Ile Asn Gly Asn Ile
50 55 60

Lys Thr Val Thr Glu Tyr Lys Ile Asp Glu Asp Gly Lys Lys Phe Lys
65 70 75 80

Ile Val Arg Thr Phe Arg Ile Glu Thr Arg Lys Ala Ser Lys Ala Val
85 90 95

Ala Arg Arg Lys Asn Trp Lys Lys Phe Gly Asn Ser Glu Phe Asp Pro
100 105 110

Pro Gly Pro Asn Val Ala Thr Thr Thr Val Ser Asp Asp Val Ser Met
115 120 125

Thr Phe Ile Thr Ser Lys Glu Asp Leu Asn Cys Gln Glu Glu Glu Asp
130 135 140

Pro Met Asn Lys Phe Lys Gly Gln Lys Ile Val Ser Cys Arg Ile Cys
145 150 155 160

Lys Gly Asp His Trp Thr Thr Arg Cys Pro Tyr Lys Asp Thr Leu Gly
165 170 175

Pro Met Gln Lys Glu Leu Ala Glu Gln Leu Gly Leu Ser Thr Gly Glu
180 185 190

Lys Glu Lys Leu Pro Gly Glu Leu Glu Pro Val Gln Ala Thr Gln Asn
195 200 205

Lys Thr Gly Lys Tyr Val Pro Pro Ser Leu Arg Asp Gly Ala Ser Arg
210 215 220

Arg Gly Glu Ser Met Gln Pro Asn Arg Arg Ala Asp Asp Asn Ala Thr
225 230 235 240

Ile Arg Val Thr Asn Leu Arg Arg Gly His Ala
245 250

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 377 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Met Arg Arg Leu Asn Arg Lys Lys Thr Leu Ser Leu Val Lys Glu Leu
1 5 10 15

Asp Ala Phe Pro Lys Val Pro Glu Ser Tyr Val Glu Thr Ser Ala Ser
20 25 30

Gly Gly Thr Val Ser Leu Ile Ala Phe Thr Thr Met Ala Leu Leu Thr
35 40 45

Ile Met Glu Phe Ser Val Tyr Gln Asp Thr Trp Met Lys Tyr Glu Tyr
50 55 60

Glu Val Asp Lys Asp Phe Ser Ser Lys Leu Arg Ile Asn Ile Asp Ile
65 70 75 80

Thr Val Ala Met Lys Cys Gln Tyr Val Gly Ala Asp Val Leu Asp Leu
85 90 95

Ala Glu Thr Met Val Ala Ser Ala Asp Gly Leu Val Tyr Glu Pro Thr
100 105 110

Val Phe Asp Leu Ser Pro Gln Gln Lys Glu Trp Gln Arg Met Leu Gln
115 120 125

Leu Ile Gln Ser Arg Leu Gln Glu Glu His Ser Leu Gln Asp Val Ile
130 135 140

Phe Lys Ser Ala Phe Lys Ser Thr Ser Thr Ala Leu Pro Pro Arg Glu
145 150 155 160

Asp Asp Ser Ser Gln Ser Pro Asn Ala Cys Arg Ile His Gly His Leu
165 170 175

Tyr Val Asn Lys Val Ala Gly Asn Phe His Ile Thr Val Gly Lys Ala
180 185 190

Ile Pro His Pro Arg Gly His Ala His Leu Ala Ala Leu Val Asn His
195 200 205

Glu Ser Tyr Asn Phe Ser His Arg Ile Asp His Leu Ser Phe Gly Glu
210 215 220

Leu Val Pro Ala Ile Ile Asn Pro Leu Asp Gly Thr Glu Lys Ile Ala
225 230 235 240

Ile Asp His Asn Gln Met Phe Gln Tyr Phe Ile Thr Val Val Pro Thr
245 250 255

Lys Leu His Thr Tyr Lys Ile Ser Ala Asp Thr His Gln Phe Ser Val
260 265 270

Thr Glu Arg Glu Arg Ile Ile Asn His Ala Ala Gly Ser His Gly Val
275 280 285

Ser Gly Ile Phe Met Lys Tyr Asp Leu Ser Ser Leu Met Val Thr Val
290 295 300

Thr Glu Glu His Met Pro Phe Trp Gln Phe Phe Val Arg Leu Cys Gly
305 310 315 320

Ile Val Gly Gly Ile Phe Ser Thr Thr Gly Met Leu His Gly Ile Gly
325 330 335

Lys Phe Ile Val Glu Ile Ile Cys Cys Arg Phe Arg Leu Gly Ser Tyr
340 345 350

Lys Pro Val Asn Ser Val Pro Phe Glu Asp Gly His Thr Asp Asn His
355 360 365

Leu Pro Leu Leu Glu Asn Asn Thr His
370 375

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
Met Gly Ser Gln His Ser Ala Ala Ala Arg Pro Ser Ser Cys Arg Arg
1 5 10 15

Lys Gln Glu Asp Asp Arg Asp Gly Leu Leu Ala Glu Arg Glu Gln Glu
20 25 30

Glu Ala Ile Ala Gln Phe Pro Tyr Val Glu Phe Thr Gly Arg Asp Ser
35 40 45

Ile Thr Cys Leu Thr Cys Gln Gly Thr Gly Tyr Ile Pro Thr Glu Gln
50 55 60

Val Asn Glu Leu Val Ala Leu Ile Pro His Ser Asp Gln Arg Leu Arg
65 70 75 80

Pro Gln Arg Thr Lys Gln Tyr Val Leu Leu Ser Ile Leu Leu Cys Leu
85 90 95

Leu Ala Ser Gly Leu Val Val Phe Phe Leu Phe Pro His Ser Val Leu
100 105 110

Val Asp Asp Asp Gly Ile Lys Val Val Lys Val Thr Phe Asn Lys Gln
115 120 125

Asp Ser Leu Val Ile Leu Thr Ile Met Ala Thr Leu Lys Ile Arg Asn
130 135 140

Ser Asn Phe Tyr Thr Val Ala Val Thr Ser Leu Ser Ser Gln Ile Gln
145 150 155 160

Tyr Met Asn Thr Val Val Ser Thr Tyr Val Thr Thr Asn Val Ser Leu
165 170 175

Ile Pro Pro Arg Ser Glu Gln Leu Val Asn Phe Thr Gly Lys Ala Glu
180 185 190

Met Gly Gly Pro Phe Ser Tyr Val Tyr Phe Phe Cys Thr Val Pro Glu
195 200 205

Ile Leu Val His Asn Ile Val Ile Phe Met Arg Thr Ser Val Lys Ile
210 215 220

Ser Tyr Ile Gly Leu Met Thr Gln Ser Ser Leu Glu Thr His His Tyr
225 230 235 240

Val Asp Cys Gly Gly Asn Ser Thr Ala Ile
245 250

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 374 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Val Thr Cys Phe His Val Pro Tyr Ser Ala Leu Thr Met Phe Ile
1 5 10 15

Ser Thr Glu Gln Thr Glu Arg Asp Ser Ala Thr Ala Tyr Arg Met Thr
20 25 30

Val Glu Val Leu Gly Thr Val Leu Gly Thr Ala Ile Gln Gly Gln Ile
35 40 45

Val Gly Gln Ala Asp Thr Pro Cys Phe Gln Asp Leu Asn Ser Ser Thr
50 55 60

Val Ala Ser Gln Ser Ala Asn His Thr His Gly Thr Thr Ser His Arg
65 70 75 80

Glu Thr Gln Lys Ala Tyr Leu Leu Ala Ala Gly Val Ile Val Cys Ile
85 90 95

Tyr Ile Ile Cys Ala Val Ile Leu Ile Leu Gly Val Arg Glu Gln Arg
100 105 110

Glu Pro Tyr Glu Ala Gln Gln Ser Glu Pro Ile Ala Tyr Phe Arg Gly
115 120 125

Leu Arg Leu Val Met Ser His Gly Pro Tyr Ile Lys Leu Ile Thr Gly
130 135 140

Phe Leu Phe Thr Ser Leu Ala Phe Met Leu Val Glu Gly Asn Phe Val
145 150 155 160

Leu Phe Cys Thr Tyr Thr Leu Gly Phe Arg Asn Glu Phe Gln Asn Leu
165 170 175

Leu Leu Ala Ile Met Leu Ser Ala Thr Leu Thr Ile Pro Ile Trp Gln
180 185 190

Trp Phe Leu Thr Arg Phe Gly Lys Lys Thr Ala Val Tyr Val Gly Ile
195 200 205

Ser Ser Ala Val Pro Phe Leu Ile Leu Val Ala Leu Met Glu Ser Asn
210 215 220

Leu Ile Ile Thr Tyr Ala Val Ala Val Ala Ala Gly Ile Ser Val Ala
225 230 235 240

Ala Ala Phe Leu Leu Pro Trp Ser Met Leu Pro Asp Val Ile Asp Asp
245 250 255

Phe His Leu Lys Gln Pro His Phe His Gly Thr Glu Pro Ile Phe Phe
260 265 270

Ser Phe Tyr Val Phe Phe Thr Lys Phe Ala Ser Gly Val Ser Leu Gly
275 280 285

Ile Ser Thr Leu Ser Leu Asp Phe Ala Gly Tyr Gln Thr Arg Gly Cys
290 295 300

Ser Gln Pro Glu Arg Val Lys Phe Thr Leu Asn Met Leu Val Thr Met
305 310 315 320

Ala Pro Ile Val Leu Ile Leu Leu Gly Leu Leu Leu Phe Lys Met Tyr
325 330 335

Pro Ile Asp Glu Glu Arg Arg Arg Gln Asn Lys Lys Ala Leu Gln Ala
340 345 350

Leu Arg Asp Glu Ala Ser Ser Ser Gly Cys Ser Glu Thr Asp Ser Thr
355 360 365

Glu Leu Ala Ser Ile Leu
370



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 334 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Val Asn Asp Pro Pro Val Pro Ala Leu Leu Trp Ala Gln Glu Val

1 5 10 15

Gly Gln Val Leu Ala Gly Arg Ala Arg Arg Leu Leu Leu Gln Phe Gly

20 25 30

Val Leu Phe Cys Thr Ile Leu Leu Leu Leu Trp Val Ser Val Phe Leu

35 40 45

Tyr Gly Ser Phe Tyr Tyr Ser Tyr Met Pro Thr Val Ser His Leu Ser

50 55 60

Pro Val His Phe Tyr Tyr Arg Thr Asp Cys Asp Ser Ser Thr Thr Ser
65 70 75 80

Leu Cys Ser Phe Pro Val Ala Asn Val Ser Leu Thr Lys Gly Gly Arg

85 90 95

Asp Arg Val Leu Met Tyr Gly Gln Pro Tyr Arg Val Thr Leu Glu Leu

100 105 110

Glu Leu Pro Glu Ser Pro Val Asn Gln Asp Leu Gly Met Phe Leu Val

115 120 125

Thr Ile Ser Cys Tyr Thr Arg Gly Gly Arg Ile Ile Ser Thr Ser Ser

130 135 140

Arg Ser Val Met Leu His Tyr Arg Ser Asp Leu Leu Gln Met Leu Asp
145 150 155 160

Thr Leu Val Phe Ser Ser Leu Leu Leu Phe Gly Phe Ala Glu Gln Lys

165 170 175

Gln Leu Leu Glu Val Glu Leu Tyr Ala Asp Tyr Arg Glu Asn Ser Tyr

180 185 190

Val Pro Thr Thr Gly Ala Ile Ile Glu Ile His Ser Lys Arg Ile Gln

195 200 205

Leu Tyr Gly Ala Tyr Leu Arg Ile His Ala His Phe Thr Gly Leu Arg

210 215 220

Tyr Leu Leu Tyr Asn Phe Pro Met Thr Cys Ala Phe Ile Gly Val Ala
225 230 235 240

Ser Asn Phe Thr Phe Leu Ser Val Ile Val Leu Phe Ser Tyr Met Gln

245 250 255

Trp Val Trp Gly Gly Ile Trp Pro Arg His Arg Phe Ser Leu Gln Val

260 265 270

Asn Ile Arg Lys Arg Asp Asn Ser Arg Lys Glu Val Gln Arg Arg Ile

275 280 285

Ser Ala His Gln Pro Gly Pro Glu Gly Gln Glu Glu Ser Thr Pro Gln
290 295 300

Ser Asp Val Thr Glu Asp Gly Glu Ser Pro Glu Asp Pro Ser Gly Thr
305 310 315 320

Glu Val Ser Cys Pro Arg Arg Arg Asn Gln Ile Ser Ser Pro

325 330 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 276 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Met Thr His Pro Gly Thr Gly Asp Ile Ile Ala Val Met Ile Thr Glu

1 5 10 15

Leu Arg Gly Lys Asp Ile Leu Ser Tyr Leu Glu Lys Asn Ile Ser Val

20 25 30

Gln Met Thr Ile Ala Val Gly Thr Arg Met Pro Pro Lys Asn Phe Ser

35 40 45

Arg Gly Ser Leu Val Phe Val Ser Ile Ser Phe Ile Val Leu Met Ile

50 55 60

Ile Ser Ser Ala Trp Leu Ile Phe Tyr Phe Ile Gln Lys Ile Arg Tyr
65 70 75 80

Thr Asn Ala Arg Asp Arg Asn Gln Arg Arg Leu Gly Asp Ala Ala Lys

85 90 95

Lys Ala Ile Ser Lys Leu Thr Thr Arg Thr Val Lys Lys Gly Asp Lys

100 105 110

Glu Thr Asp Pro Asp Phe Asp His Cys Ala Val Cys Ile Glu Ser Tyr

115 120 125

Lys Gln Asn Asp Val Val Arg Ile Leu Pro Cys Lys His Val Phe His

130 135 140 

Lys Ser Cys Val Asp Pro Trp Leu Ser Glu His Cys Thr Cys Pro Met 
145 150 155 160

Cys Lys Leu Asn Ile Leu Lys Ala Leu Gly Ile Val Pro Asn Leu Pro

165 170 175

Cys Thr Asp Asn Val Ala Phe Asp Met Glu Arg Leu Thr Arg Thr Gln

180 185 190

Ala Val Asn Arg Arg Ser Ala Leu Gly Asp Leu Ala Gly Asp Asn Ser

195 200 205

Leu Gly Leu Glu Pro Leu Arg Thr Ser Gly Ile Ser Pro Leu Pro Gln

210 215 220

Asp Gly Glu Leu Thr Pro Arg Thr Gly Glu Ile Asn Ile Ala Val Thr
225 230 235 240

Lys Glu Trp Phe Ile Ile Ala Ser Phe Gly Leu Leu Ser Ala Leu Thr

245 250 255

Leu Cys Tyr Met Ile Ile Arg Ala Thr Ala Ser Leu Asn Ala Asn Glu

260 265 270

Val Glu Trp Phe

275



(2) INFORMATION FOR SEQ ID NO: 36: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



Met Ala Asn Ser Gly Leu Gln Leu Leu Gly Phe Ser Met Ala Leu Leu

1 5 10 15

Gly Trp Val Gly Leu Val Ala Cys Thr Ala Ile Pro Gln Trp Gln Met

20 25 30

Ser Ser Tyr Ala Gly Asp Asn Ile Ile Thr Ala Gln Ala Met Tyr Lys

35 40 45

Gly Leu Trp Met Asp Cys Val Thr Gln Ser Thr Gly Met Met Ser Cys

50 55 60 

Lys Met Tyr Asp Ser Val Leu Ala Leu Ser Ala Ala Leu Gln Ala Thr
65 70 75 80

Arg Ala Leu Met Val Val Ser Leu Val Leu Gly Phe Leu Ala Met Phe

85 90 95

Val Ala Thr Met Gly Met Lys Cys Thr Arg Cys Gly Gly Asp Asp Lys

100 105 110

Val Lys Lys Ala Arg Ile Ala Met Gly Gly Gly Ile Ile Phe Ile Val

115 120 125

Ala Gly Leu Ala Ala Leu Val Ala Cys Ser Trp Tyr Gly His Gln Ile

130 135 140

Val Thr Asp Phe Tyr Asn Pro Leu Ile Pro Thr Asn Ile Lys Tyr Glu
145 150 155 160

Phe Gly Pro Ala Ile Phe Ile Gly Trp Ala Gly Ser Ala Leu Val Ile

165 170 175 

Leu Gly Gly Ala Leu Leu Ser Cys Ser Cys Pro Gly Asn Glu Ser Lys 

180 185 190 

Ala Gly Tyr Arg Ala Pro Arg Ser Tyr Pro Lys Ser Asn Ser Ser Lys 
195 200 205 

Glu Tyr 
210 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
Met Ile Arg Pro Gln Leu Arg Thr Ala Gly Leu Gly Arg Cys

1 5 10 15

Pro Gly Leu Leu Leu Leu Leu Val Pro Val Leu Trp Ala Gly Ala Glu

20 25 30

Lys Leu His Thr Gln Pro Ser Cys Pro Ala Val Cys Gln Pro Thr Arg

35 40 45

Cys Pro Ala Leu Pro Thr Cys Ala Leu Gly Thr Thr Pro Val Phe Asp

50 55 60

Leu Cys Arg Cys Cys Arg Val Cys Pro Ala Ala Glu Arg Glu Val Cys
65 70 75 80

Gly Gly Ala Gln Gly Gln Pro Cys Ala Pro Gly Leu Gln Cys Leu Gln

85 90 95

Pro Leu Arg Pro Gly Phe Pro Ser Thr Cys Gly Cys Pro Thr Leu Gly

100 105 110

Gly Ala Val Cys Gly Ser Asp Arg Arg Thr Tyr Pro Ser Met Cys Ala

115 120 125

Leu Arg Ala Glu Asn Arg Ala Ala Arg Arg Leu Gly Lys Val Pro Ala

130 135 140

Val Pro Val Gln Trp Gly Asn Cys Gly Asp Thr Gly Thr Arg Ser Ala
145 150 155 160

Gly Pro Leu Arg Arg Asn Tyr Asn Phe Ile Ala Ala Val Val Glu Lys

165 170 175

Val Ala Pro Ser Val Val His Val Gln Leu Trp Gly Arg Leu Leu His

180 185 190

Gly Ser Arg Leu Val Pro Val Tyr Ser Gly Ser Gly Phe Ile Val Ser

195 200 205

Glu Asp Gly Leu Ile Ile Thr Asn Ala His Val Val Arg Asn Gln Gln

210 215 220

Trp Ile Glu Val Val Leu Gln Asn Gly Ala Arg Tyr Glu Ala Val Val
225 230 235 240

Lys Asp Ile Asp Leu Lys Leu Asp Leu Ala Val Ile Lys Ile Glu Ser

245 250 255

Asn Ala Glu Leu Pro Val Leu Met Leu Gly Arg Ser Ser Asp Leu Arg

260 265 270

Ala Gly Glu Phe Val Val Ala Leu Gly Ser Pro Phe Ser Leu Gln Asn

275 280 285

Thr Ala Thr Ala Gly Ile Val Ser Thr Lys Gln Arg Gly Gly Lys Glu
290 295 300

Leu Gly Met Lys Asp Ser Asp Met Asp Tyr Val Gln Ile Asp Ala Thr
305 310 315 320

Ile Asn Tyr Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Asp

325 330 335

Val Ile Gly Val Asn Ser Leu Arg Val Thr Asp Gly Ile Ser Phe Ala

340 345 350

Ile Pro Ser Asp Arg Val Arg Gln Phe Leu Ala Glu Tyr His Glu His

355 360 365

Gln Met Lys Gly Lys Ala Phe Ser Asn Lys Lys Tyr Leu Gly Leu Gln

370 375 380

Met Leu Ser Leu Thr Val Pro Leu Ser Glu Glu Leu Lys Met His Tyr
385 390 395 400

Pro Asp Phe Pro Asp Val Ser Ser Gly Val Tyr Val Cys Lys Val Val

405 410 415

Glu Gly Thr Ala Ala Gln Ser Ser Gly Leu Arg Asp His Asp Val Ile

420 425 430

Val Asn Ile Asn Gly Lys Pro Ile Thr Thr Thr Thr Asp Val Val Lys

435 440 445

Ala Leu Asp Ser Asp Ser Leu Ser Met Ala Val Leu Arg Gly Lys Asp

450 455 460

Asn Leu Leu Leu Thr Val Ile Pro Glu Thr Ile Asn
465 470 475



(2) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 266 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
Met Val Lys Val Thr Phe Asn Ser Ala Leu Ala Gln Lys Glu Ala Lys

1 5 10 15

Lys Asp Glu Pro Glu Ser Gly Glu Glu Ala Leu Ile Ile Pro Pro Asp

20 25 30

Ala Val Ala Val Asp Cys Lys Asp Pro Asp Asp Val Val Pro Val Gly

35 40 45

Gln Arg Arg Ala Trp Cys Trp Cys Met Cys Phe Gly Leu Ala Phe Met

50 55 60

Leu Ala Gly Val Ile Leu Gly Gly Ala Tyr Leu Tyr Lys Tyr Phe Ala
65 70 75 80

Leu Gln Pro Asp Asp Val Tyr Tyr Cys Gly Ile Lys Tyr Ile Lys Asp

85 90 95

Asp Val Ile Leu Asn Glu Pro Ser Ala Asp Ala Pro Ala Ala Leu Tyr

100 105 110

Gln Thr Ile Glu Glu Asn Ile Lys Ile Phe Glu Glu Glu Glu Val Glu

115 120 125

Phe Ile Ser Val Pro Val Pro Glu Phe Ala Asp Ser Asp Pro Ala Asn

130 135 140

Ile Val His Asp Phe Asn Lys Lys Leu Thr Ala Tyr Leu Asp Leu Asn
145 150 155 160

Leu Asp Lys Cys Tyr Val Ile Pro Leu Asn Thr Ser Ile Val Met Pro

165 170 175

Pro Arg Asn Leu Leu Glu Leu Leu Ile Asn Ile Lys Ala Gly Thr Tyr

180 185 190

Leu Pro Gln Ser Tyr Leu Ile His Glu His Met Val Ile Thr Asp Arg

195 200 205

Ile Glu Asn Ile Asp His Leu Gly Phe Phe Ile Tyr Arg Leu Cys His

210 215 220

Asp Lys Glu Thr Tyr Lys Leu Gln Arg Arg Glu Thr Ile Lys Gly Ile
225 230 235 240

Gln Lys Arg Glu Ala Ser Asn Cys Phe Ala Ile Arg His Phe Glu Asn

245 250 255

Lys Phe Ala Val Glu Thr Leu Ile Cys Ser

260 265 